(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 
International Bureau 

(43) International Publication Date 
20 February 2003 (20.02.2003) 




PCT 






(10) International Publication Number 

WO 03/013227 A2 



(51) International Patent Classification 7 : A01H 

(21) International Application Number: PCT/US02/25805 

(22) International Filing Date: 9 August 2002 (09.08.2002) 



(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
60/310,847    9 August 2001 (09.08.2001)    US
60/336,049    19 November 2001 (19.11.2001)    US
60/338,692    11 December 2001 (11.12.2001)    US
10/171,468    14 June 2002 (14.06.2002)    US

(71) Applicant (for all designated States except US):
MENDEL BIOTECHNOLOGY, INC. [US/US]; 21375
Cabot Boulevard, Hayward, CA 94945 (US).

(72) Inventors; and

(75) Inventors/Applicants (for US only): RATCLIFFE,
Oliver [GB/US]; 814 East 21st Street, Oakland, CA
94606 (US). RIECHMANN, Jose, Luis [ES/US]; 518
S. El Molino Avenue, #308, Pasadena, CA 91101 (US).
ADAM, Luc, J. [CA/US]; 25800 Industrial Boulevard,
Apt. L403, Hayward, CA 94545 (US). DUBELL, Arnold,
T. [US/US]; 14857 Wake Avenue, San Leandro, CA 94578
(US). HEARD, Jacqueline, E. [US/US]; 810 Guilford
Avenue, San Mateo, CA 94402 (US). PILGRIM, Marsha,
L. [US/US]; 1368 Patrick Henry Drive, Phoenixville,
PA 19460 (US). JIANG, Cai-Zhong [US/US]; 34495
Heathrow Terrace, Fremont, CA 94555 (US). REUBER,
T., Lynne [US/US]; 1115 S. Grant Street, San Mateo, CA
94402 (US). CREELMAN, Robert, A. [US/US]; 2801
Jennifer Drive, Castro Valley, CA 94546 (US). PINEDA,
Omaira [CO/US]; 4060 9th Place, Vero Beach, FL 32960
(US). YU, Guo-Liang [US/US]; 242 Gravett Drive, Berkeley,
CA 94705-1531 (US). BROUN, Pierre, E. [FR/US];
921 Sunnybrae Boulevard, San Mateo, CA 94402 (US).

(74) Agents: WARD, Michael, R. et al.; Morrison & Foerster
LLP, 425 Market Street, San Francisco, CA 94105-2482
(US).



(81) Designated States (national): AE, AG, AL, AM, AT, AU, 
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU, 
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH, 
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC,
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, 
MX, MZ, NO, NZ, OM, PH, PL, PT, RO, RU, SD, SE, SG, 
SI, SK, SL, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, 
VN, YU, ZA, ZM, ZW. 

(84) Designated States (regional): ARIPO patent (GH, GM, 
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW), 
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), 
European patent (AT, BE, BG, CH, CY, CZ, DE, DK, EE, 
ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE, SK, 
TR), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, 
GW, ML, MR, NE, SN, TD, TG). 

Published: 

— without international search report and to be republished 
upon receipt of that report 

For two-letter codes and other abbreviations, refer to the "Guid- 
ance Notes on Codes and Abbreviations " appearing at the begin- 
ning of each regular issue of the PCT Gazette. 






(54) Title: YIELD-RELATED POLYNUCLEOTIDES AND POLYPEPTIDES IN PLANTS 

(57) Abstract: The invention relates to plant transcription factor polypeptides, polynucleotides that encode them, homologs from a 
variety of plant species, and methods of using the polynucleotides and polypeptides to produce transgenic plants having advantageous 
properties compared to a reference plant. Sequence information related to these polynucleotides and polypeptides can also be used 
in bioinformatic search methods and is also disclosed. 



WO 03/013227 PCT/US02/25805 
YIELD-RELATED POLYNUCLEOTIDES AND POLYPEPTIDES IN PLANTS 



This application claims the benefit of US Provisional Application No. 
60/310,847, filed August 9, 2001, US Provisional Application No. 60/336,049, filed 
December 5, 2001, US Provisional Application No. 60/338,692, filed December 11, 
2001, and US Non-provisional Application No. 10/171,468, filed June 14, 2002, the 
entire contents of which are hereby incorporated by reference. 

FIELD OF THE INVENTION 

This invention relates to the field of plant biology. More particularly, the 
present invention pertains to compositions and methods for phenotypically modifying 
a plant. 

INTRODUCTION 

A plant's traits, such as its biochemical, developmental, or phenotypic 
characteristics, may be controlled through a number of cellular processes. One 
important way to manipulate that control is through transcription factors - proteins 
that influence the expression of a particular gene or sets of genes. Transformed and 
transgenic plants that comprise cells having altered levels of at least one selected 
transcription factor, for example, possess advantageous or desirable traits. Strategies 
for manipulating traits by altering a plant cell's transcription factor content can 
therefore result in plants and crops with commercially valuable properties. Applicants 
have identified polynucleotides encoding transcription factors, developed numerous 
transgenic plants using these polynucleotides, and have analyzed the plants for a 
variety of important traits. In so doing, applicants have identified important 
polynucleotide and polypeptide sequences for producing commercially valuable 
plants and crops as well as the methods for making them and using them. Other 
aspects and embodiments of the invention are described below and can be derived 
from the teachings of this disclosure as a whole. 

BACKGROUND OF THE INVENTION 

Transcription factors (TFs) can modulate gene expression, either increasing or 
decreasing (inducing or repressing) the rate of transcription. This modulation results 
in differential levels of gene expression at various developmental stages, in different 
tissues and cell types, and in response to different exogenous (e.g., environmental)
and endogenous stimuli throughout the life cycle of the organism. 

Because transcription factors are key controlling elements of biological 
pathways, altering the expression levels of one or more transcription factors can 
change entire biological pathways in an organism. For example, manipulation of the 
levels of selected transcription factors may result in increased expression of
economically useful proteins or metabolic chemicals in plants, or in improvement of other
agriculturally relevant characteristics. Conversely, blocked or reduced expression of a
transcription factor may reduce biosynthesis of unwanted compounds or remove an 
undesirable trait. Therefore, manipulating transcription factor levels in a plant offers 
tremendous potential in agricultural biotechnology for modifying a plant's traits. 

The present invention provides novel transcription factors useful for
modifying a plant's phenotype in desirable ways. 

SUMMARY OF THE INVENTION 

In a first aspect, the invention relates to a recombinant polynucleotide 
comprising a nucleotide sequence selected from the group consisting of: (a) a 
nucleotide sequence encoding a polypeptide comprising a polypeptide sequence 
selected from those of the Sequence Listing, SEQ ID NOs:2 to 2N, where N = 2-561, 
or those listed in Table 4, or a complementary nucleotide sequence thereof; (b) a 
nucleotide sequence encoding a polypeptide comprising a variant of a polypeptide of 
(a) having one or more, or between 1 and about 5, or between 1 and about 10, or 
between 1 and about 30, conservative amino acid substitutions; (c) a nucleotide 
sequence comprising a sequence selected from those of SEQ ID NOs:1 to (2N - 1),
where N = 2-561, or those included in Table 4, or a complementary nucleotide 
sequence thereof; (d) a nucleotide sequence comprising silent substitutions in a 
nucleotide sequence of (c); (e) a nucleotide sequence which hybridizes under stringent 
conditions over substantially the entire length of a nucleotide sequence of one or more 
of: (a), (b), (c), or (d); (f) a nucleotide sequence comprising at least 10 or 15, or at 
least about 20, or at least about 30 consecutive nucleotides of a sequence of any of 
(a)-(e), or at least 10 or 15, or at least about 20, or at least about 30 consecutive 
nucleotides outside of a region encoding a conserved domain of any of (a)-(e); (g) a 
nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which
subsequence or fragment encodes a polypeptide having a biological activity that 
modifies a plant's characteristic, functions as a transcription factor, or alters the level 
of transcription of a gene or transgene in a cell; (h) a nucleotide sequence having at 
least 31% sequence identity to a nucleotide sequence of any of (a)-(g); (i) a 
nucleotide sequence having at least 60%, or at least 70 %, or at least 80 %, or at least 
90 %, or at least 95 % sequence identity to a nucleotide sequence of any of (a)-(g) or a 
10 or 15 nucleotide, or at least about 20, or at least about 30 nucleotide region of a 
sequence of (a)-(g) that is outside of a region encoding a conserved domain; (j) a 
nucleotide sequence that encodes a polypeptide having at least 31% sequence identity 
to a polypeptide listed in Table 4, or the Sequence Listing; (k) a nucleotide sequence 
which encodes a polypeptide having at least 60%, or at least 70 %, or at least 80%, or 
at least 90 %, or at least 95 % sequence identity to a polypeptide listed in Table 4, or 
the Sequence Listing; and (l) a nucleotide sequence that encodes a conserved domain
of a polypeptide having at least 85%, or at least 90%, or at least 95%, or at least 98% 
sequence identity to a conserved domain of a polypeptide listed in Table 4, or the 
Sequence Listing. The recombinant polynucleotide may further comprise a 
constitutive, inducible, or tissue-specific promoter operably linked to the nucleotide 
sequence. The invention also relates to compositions comprising at least two of the 
above-described polynucleotides. 
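
The identifier ranges recited above (polynucleotides at SEQ ID NOs:1 to (2N - 1), polypeptides at SEQ ID NOs:2 to 2N) can be illustrated with a short sketch. It assumes that each odd-numbered polynucleotide is paired with the next even-numbered polypeptide, which the text implies but does not state outright; the function name `seq_id_pair` is hypothetical and used only for illustration.

```python
def seq_id_pair(n: int) -> tuple:
    """Return (polynucleotide SEQ ID NO, polypeptide SEQ ID NO) for index n.

    Assumption: polynucleotide 2n - 1 is paired with polypeptide 2n.
    The upper bound of 561 is taken from the recited range "N = 2-561".
    """
    if not 1 <= n <= 561:
        raise ValueError("n must be between 1 and 561")
    return 2 * n - 1, 2 * n


if __name__ == "__main__":
    # First and last pairs implied by the recited range
    print(seq_id_pair(1))
    print(seq_id_pair(561))
```

Under this reading, the largest index, N = 561, corresponds to polynucleotide SEQ ID NO:1121 and polypeptide SEQ ID NO:1122.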

In a second aspect, the invention comprises an isolated or recombinant 
polypeptide comprising a subsequence of at least about 10, or at least about 15, or at 
least about 20, or at least about 30 contiguous amino acids encoded by the 
recombinant or isolated polynucleotide described above, or comprising a subsequence 
of at least about 8, or at least about 12, or at least about 15, or at least about 20, or at 
least about 30 contiguous amino acids outside a conserved domain. 

In a third aspect, the invention comprises an isolated or recombinant 
polynucleotide that encodes a polypeptide that is a paralog of the isolated polypeptide 
described above. In one aspect, the invention is a paralog which, when expressed in
Arabidopsis, modifies a trait of the Arabidopsis plant.

In a fourth aspect, the invention comprises an isolated or recombinant
polynucleotide that encodes a polypeptide that is an ortholog of the isolated 
polypeptide described above. In one aspect, the invention is an ortholog which, when 
expressed in Arabidopsis, modifies a trait of the Arabidopsis plant. 

In a fifth aspect, the invention comprises an isolated polypeptide that is a 
paralog of the isolated polypeptide described above. In one aspect, the invention is an 
paralog which, when expressed in Arabidopsis, modifies a trait of the Arabidopsis 
plant. 

In a sixth aspect, the invention comprises an isolated polypeptide that is an 
ortholog of the isolated polypeptide described above. In one aspect, the invention is 
an ortholog which, when expressed in Arabidopsis, modifies a trait of the
Arabidopsis plant.

The present invention also encompasses transcription factor variants. A 
preferred transcription factor variant is one having at least 40% amino acid sequence 
identity, a more preferred transcription factor variant is one having at least 50% amino 
acid sequence identity and a most preferred transcription factor variant is one having 
at least 65% amino acid sequence identity to the transcription factor amino acid 
sequence SEQ ID NOs:2 to 2N, where N = 2-561, and which contains at least one 
functional or structural characteristic of the transcription factor amino acid sequence. 
Sequences having lesser degrees of identity but comparable biological activity are 
considered to be equivalents. 
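
The identity thresholds recited for transcription factor variants can be illustrated as follows. This is a minimal sketch that assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and that identity is computed over the full alignment length; other conventions for the denominator exist, and the disclosure does not fix one here. The function names and the example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding identical, non-gap residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def variant_tier(identity: float) -> str:
    """Map an identity percentage onto the 40% / 50% / 65% tiers in the text."""
    if identity >= 65.0:
        return "most preferred"
    if identity >= 50.0:
        return "more preferred"
    if identity >= 40.0:
        return "preferred"
    return "below the recited thresholds"


if __name__ == "__main__":
    # Made-up peptides differing at one of eight aligned positions
    identity = percent_identity("MKTAYIAK", "MKTAYLAK")
    print(identity, variant_tier(identity))
```

Sequences falling below the lowest tier are not thereby excluded, since the text treats sequences with lesser identity but comparable biological activity as equivalents.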

In another aspect, the invention is a transgenic plant comprising one or more 
of the above-described isolated or recombinant polynucleotides. In yet another 
aspect, the invention is a plant with altered expression levels of a polynucleotide 
described above or a plant with altered expression or activity levels of an above- 
described polypeptide. Further, the invention is a plant lacking a nucleotide sequence 
encoding a polypeptide described above or substantially lacking a polypeptide 
described above. The plant may be any plant, including, but not limited to, 
Arabidopsis, mustard, soybean, wheat, corn, potato, cotton, rice, oilseed rape, 
sunflower, alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, 
raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes,
honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, 
spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, vegetable 
brassicas, and mint or other labiates. In yet another aspect, the invention is an
isolated plant material of a plant, including, but not limited to, plant tissue, fruit, seed, 
plant cell, embryo, protoplast, pollen, and the like. In yet another aspect, the 
invention is a transgenic plant tissue culture of regenerable cells, including, but not 
limited to, embryos, meristematic cells, microspores, protoplast, pollen, and the like. 

In yet another aspect the invention is a transgenic plant comprising one or 
more of the above described polynucleotides wherein the encoded polypeptide is
expressed and regulates transcription of a gene. 

In a further aspect the invention provides a method of using the polynucleotide 
composition to breed a progeny plant from a transgenic plant including crossing 
plants, producing seeds from transgenic plants, and methods of breeding using 
transgenic plants, the method comprising transforming a plant with the polynucleotide 
composition to create a transgenic plant, crossing the transgenic plant with another 
plant, selecting seed, and growing the progeny plant from the seed. 

In a further aspect, the invention provides a progeny plant derived from a 
parental plant wherein said progeny plant exhibits at least three fold greater 
messenger RNA levels than said parental plant, wherein the messenger RNA encodes 
a DNA-binding protein which is capable of binding to a DNA regulatory sequence 
and inducing expression of a plant trait gene, wherein the progeny plant is 
characterized by a change in the plant trait compared to said parental plant. In yet a 
further aspect, the progeny plant exhibits at least ten fold greater messenger RNA 
levels compared to said parental plant. In yet a further aspect, the progeny plant 
exhibits at least fifty fold greater messenger RNA levels compared to said parental 
plant. 
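
The three-fold, ten-fold, and fifty-fold messenger RNA thresholds recited above amount to a simple ratio test. The sketch below assumes that mRNA levels for the progeny and parental plants have been measured in the same arbitrary units; the function names and the example values are hypothetical and are not data from this disclosure.

```python
from typing import Optional

# Fold thresholds recited in the text
FOLD_THRESHOLDS = (3.0, 10.0, 50.0)


def fold_change(progeny_mrna: float, parental_mrna: float) -> float:
    """Ratio of progeny to parental mRNA level (same units assumed)."""
    if parental_mrna <= 0:
        raise ValueError("parental mRNA level must be positive")
    return progeny_mrna / parental_mrna


def highest_threshold_met(fold: float) -> Optional[float]:
    """Return the largest recited threshold that the fold change reaches."""
    met = [t for t in FOLD_THRESHOLDS if fold >= t]
    return max(met) if met else None


if __name__ == "__main__":
    fold = fold_change(progeny_mrna=120.0, parental_mrna=10.0)
    print(fold, highest_threshold_met(fold))
```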

In a further aspect, the invention relates to a cloning or expression vector 
comprising the isolated or recombinant polynucleotide described above or cells 
comprising the cloning or expression vector. 

In yet a further aspect, the invention relates to a composition produced by
incubating a polynucleotide of the invention with a nuclease, a restriction enzyme, a 
polymerase; a polymerase and a primer; a cloning vector, or with a cell. 

Furthermore, the invention relates to a method for producing a plant having a 
modified trait. The method comprises altering the expression of an isolated or 
recombinant polynucleotide of the invention or altering the expression or activity of a 
polypeptide of the invention in a plant to produce a modified plant, and selecting the 
modified plant for a modified trait. In one aspect, the plant is a monocot plant. In 
another aspect, the plant is a dicot plant. In another aspect the recombinant 
polynucleotide is from a dicot plant and the plant is a monocot plant. In yet another 
aspect the recombinant polynucleotide is from a monocot plant and the plant is a dicot 
plant. In yet another aspect the recombinant polynucleotide is from a monocot plant 
and the plant is a monocot plant. In yet another aspect the recombinant 
polynucleotide is from a dicot plant and the plant is a dicot plant. 

In another aspect, the invention is a transgenic plant comprising an isolated or 
recombinant polynucleotide encoding a polypeptide wherein the polypeptide is 
selected from the group consisting of SEQ ID NOs: 2 - 2N, where N = 2-561. In yet 
another aspect, the invention is a plant with altered expression levels of a polypeptide 
described above or a plant with altered expression or activity levels of an above- 
described polypeptide. Further, the invention is a plant lacking a polynucleotide 
sequence encoding a polypeptide described above or substantially lacking a 
polypeptide described above. The plant may be any plant, including, but not limited 
to, Arabidopsis, mustard, soybean, wheat, corn, potato, cotton, rice, oilseed rape, 
sunflower, alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, 
raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, 
honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, 
spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, vegetable 
brassicas, and mint or other labiates. In yet another aspect, the invention is an
isolated plant material of a plant, including, but not limited to, plant tissue, fruit, seed, 
plant cell, embryo, protoplast, pollen, and the like. In yet another aspect, the 
invention is a transgenic plant tissue culture of regenerable cells, including, but not
limited to, embryos, meristematic cells, microspores, protoplast, pollen, and the like. 



In another aspect, the invention relates to a method of identifying a factor that 
is modulated by or interacts with a polypeptide encoded by a polynucleotide of the 
invention. The method comprises expressing a polypeptide encoded by the 
polynucleotide in a plant; and identifying at least one factor that is modulated by or 
interacts with the polypeptide. In one embodiment the method for identifying 
modulating or interacting factors is by detecting binding by the polypeptide to a 
promoter sequence, or by detecting interactions between an additional protein and the 
polypeptide in a yeast two hybrid system, or by detecting expression of a factor by 
hybridization to a microarray, subtractive hybridization, or differential display. 

In yet another aspect, the invention is a method of identifying a molecule that 
modulates activity or expression of a polynucleotide or polypeptide of interest. The 
method comprises placing the molecule in contact with a plant comprising the 
polynucleotide or polypeptide encoded by the polynucleotide of the invention and 
monitoring one or more of the expression level of the polynucleotide in the plant, the 
expression level of the polypeptide in the plant, and modulation of an activity of the 
polypeptide in the plant. 

In yet another aspect, the invention relates to an integrated system, computer 
or computer readable medium comprising one or more character strings 
corresponding to a polynucleotide of the invention, or to a polypeptide encoded by the 
polynucleotide. The integrated system, computer or computer readable medium may 
comprise a link between one or more sequence strings to a modified plant trait. 
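
One way to picture character strings linked to a modified plant trait is a small in-memory record store. The sketch below is only an illustration under stated assumptions: the class name `SequenceRecord`, the field names, the sequences, and the trait labels are all hypothetical placeholders and do not reproduce any entry of the Sequence Listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRecord:
    """A sequence character string together with its linked traits."""
    seq_id_no: int
    sequence: str
    modified_traits: tuple = ()


def build_index(records):
    """Index records by SEQ ID NO for direct lookup."""
    return {record.seq_id_no: record for record in records}


if __name__ == "__main__":
    # Placeholder data, invented for illustration only
    index = build_index([
        SequenceRecord(1, "ATGGCTTCAGGA", ("placeholder trait A",)),
        SequenceRecord(3, "ATGAAACCCGGG", ("placeholder trait B",)),
    ])
    print(index[1].modified_traits)
```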

In yet another aspect, the invention is a method for identifying a sequence 
similar or homologous to one or more polynucleotides of the invention, or one or 
more polypeptides encoded by the polynucleotides. The method comprises providing 
a sequence database, and querying the sequence database with one or more target 
sequences corresponding to the one or more polynucleotides or to the one or more 
polypeptides to identify one or more sequence members of the database that display 
sequence similarity or homology to one or more of the one or more target sequences. 
The method may further comprise linking the one or more of the
polynucleotides of the invention, or encoded polypeptides, to a modified plant 
phenotype. 
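
The database query described above can be sketched in a few lines. Note that this example substitutes a crude shared k-mer score for a real alignment tool; it is not BLAST and does not reproduce the search actually used for this disclosure. The function names, the `k` and `min_score` parameters, and the database entries are hypothetical.

```python
def kmers(sequence: str, k: int = 4) -> set:
    """All overlapping substrings of length k in the sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def query_database(target: str, database: dict, k: int = 4,
                   min_score: float = 0.5) -> list:
    """Return (name, score) for entries sharing enough k-mers with target.

    The score is the fraction of the target's k-mers found in the entry,
    a stand-in for a proper similarity or homology measure.
    """
    target_kmers = kmers(target, k)
    if not target_kmers:
        return []
    hits = []
    for name, sequence in database.items():
        score = len(target_kmers & kmers(sequence, k)) / len(target_kmers)
        if score >= min_score:
            hits.append((name, score))
    # Best-scoring database members first
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Invented entries for illustration only
    database = {"entry_a": "ATGGCTTCAGGA", "entry_b": "TTTTTTTTTTTT"}
    print(query_database("ATGGCTTCA", database))
```

Hits returned by such a query could then be linked to a modified plant phenotype, as the preceding paragraph describes.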

BRIEF DESCRIPTION OF THE SEQUENCE LISTING, 
TABLES, AND FIGURE 
The Sequence Listing provides exemplary polynucleotide and polypeptide 
sequences of the invention. The traits associated with the use of the sequences are 
included in the Examples. 

Diskette1 is a read-only memory computer-readable diskette and contains a
copy of the Sequence Listing in ASCII text format. The Sequence Listing is named 
"SEQLIST5 14442002041" and is 929 kilobytes in size. The copy of the Sequence 
Listing on the diskette is hereby incorporated by reference in its entirety. 

Table 4 shows the polynucleotides and polypeptides identified by SEQ ID 
NO; Mendel Gene ID No.; conserved domain of the polypeptide; and if the 
polynucleotide was tested in a transgenic assay. The first column shows the 
polynucleotide SEQ ID NO; the second column shows the Mendel Gene ID No., GID; 
the third column shows the trait(s) resulting from the knock out or overexpression of 
the polynucleotide in the transgenic plant; the fourth column shows the category of 
the trait; the fifth column shows the transcription factor family to which the 
polynucleotide belongs; the sixth column ("Comment"), includes specific effects and 
utilities conferred by the polynucleotide of the first column; the seventh column 
shows the SEQ ID NO of the polypeptide encoded by the polynucleotide; and the 
eighth column shows the amino acid residue positions of the conserved domain in 
amino acid (AA) co-ordinates. 
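
The eight columns of Table 4 described above map naturally onto a record type. The sketch below is an assumption-laden illustration: the class and field names are hypothetical, the example values are placeholders rather than actual Table 4 entries, and the "start-end" format assumed for the conserved domain coordinates may differ from the format used in the table itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table4Row:
    """One row of Table 4, with fields in the column order given in the text."""
    polynucleotide_seq_id: int   # column 1
    gid: str                     # column 2, Mendel Gene ID No.
    traits: str                  # column 3
    trait_category: str          # column 4
    tf_family: str               # column 5
    comment: str                 # column 6
    polypeptide_seq_id: int      # column 7
    conserved_domain_aa: str     # column 8, assumed "start-end" format

    def domain_bounds(self) -> tuple:
        """Parse the assumed 'start-end' amino acid coordinates."""
        start, end = self.conserved_domain_aa.split("-")
        return int(start), int(end)


if __name__ == "__main__":
    # Placeholder values, invented for illustration only
    row = Table4Row(1, "G0000", "placeholder trait", "placeholder category",
                    "AP2", "placeholder comment", 2, "10-77")
    print(row.domain_bounds())
```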

Table 5 lists a summary of orthologous and homologous sequences identified 
using BLAST (tblastx program). The first column shows the polynucleotide sequence 
identifier (SEQ ID NO), the second column shows the corresponding cDNA identifier 
(Gene ID), the third column shows the orthologous or homologous polynucleotide
GenBank Accession Number (Test Sequence ID), the fourth column shows the 
calculated probability value that the sequence identity is due to chance (Smallest Sum
Probability), the fifth column shows the plant species from which the test sequence 
was isolated (Test Sequence Species), and the sixth column shows the orthologous or 
homologous test sequence GenBank annotation (Test Sequence GenBank 
Annotation). 
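
A reader working with Table 5 might wish to keep only hits whose Smallest Sum Probability falls below a chosen cutoff. The sketch below assumes the six columns described above and uses hypothetical field names, a hypothetical cutoff, and invented rows; none of the values are taken from Table 5.

```python
from typing import NamedTuple


class Table5Row(NamedTuple):
    """One row of Table 5, with fields in the column order given in the text."""
    seq_id_no: int
    gene_id: str
    test_sequence_id: str
    smallest_sum_probability: float
    test_sequence_species: str
    genbank_annotation: str


def significant_hits(rows, max_probability: float = 1e-20):
    """Keep rows whose probability of a chance match is at or below the cutoff."""
    return [r for r in rows if r.smallest_sum_probability <= max_probability]


if __name__ == "__main__":
    # Invented rows for illustration only
    rows = [
        Table5Row(1, "G0000", "ACC00001", 1.0e-45, "Species A", "placeholder"),
        Table5Row(1, "G0000", "ACC00002", 2.0e-05, "Species B", "placeholder"),
    ]
    for hit in significant_hits(rows):
        print(hit.test_sequence_id, hit.smallest_sum_probability)
```

A smaller probability indicates that the observed similarity is less likely to be due to chance, so lowering the cutoff makes the filter more stringent.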

Figure 1 shows a phylogenic tree of related plant families adapted from Daly 
et al. (2001 Plant Physiology 127:1328-1333). 

Detailed Description of Exemplary Embodiments 

In an important aspect, the present invention relates to polynucleotides and 
polypeptides, e.g. for modifying phenotypes of plants. Throughout this disclosure, 
various information sources are referred to and/or are specifically incorporated. The 
information sources include scientific journal articles, patent documents, textbooks, 
and World Wide Web browser-inactive page addresses, for example. While the 
reference to these information sources clearly indicates that they can be used by one 
of skill in the art, applicants specifically incorporate each and every one of the 
information sources cited herein, in their entirety, whether or not a specific mention of 
"incorporation by reference" is noted. The contents and teachings of each and every 
one of the information sources can be relied on and used to make and use 
embodiments of the invention. 

It must be noted that as used herein and in the appended claims, the singular 
forms "a," "an," and "the" include plural reference unless the context clearly dictates 
otherwise. Thus, for example, a reference to "a plant" includes a plurality of such 
plants, and a reference to "a stress" is a reference to one or more stresses and 
equivalents thereof known to those skilled in the art, and so forth. 

The polynucleotide sequences of the invention encode polypeptides that are 
members of well-known transcription factor families, including plant transcription 
factor families, as disclosed in Table 4. Generally, the transcription factors encoded 
by the present sequences are involved in cell differentiation and proliferation and the 
regulation of growth. Accordingly, one skilled in the art would recognize that by 
expressing the present sequences in a plant, one may change the expression of 
autologous genes or induce the expression of introduced genes. By affecting the
expression of similar autologous sequences in a plant that have the biological activity 
of the present sequences, or by introducing the present sequences into a plant, one 
may alter a plant's phenotype to one with improved traits. The sequences of the 
invention may also be used to transform a plant and introduce desirable traits not 
found in the wild-type cultivar or strain. Plants may then be selected for those that 
produce the most desirable degree of over- or underexpression of target genes of 
interest and coincident trait improvement. 

The sequences of the present invention may be from any species, particularly 
plant species, in a naturally occurring form or from any source whether natural, 
synthetic, semi-synthetic or recombinant. The sequences of the invention may also 
include fragments of the present amino acid sequences. In this context, a "fragment" 
refers to a fragment of a polypeptide sequence which is at least 5 to about 15 amino 
acids in length, most preferably at least 14 amino acids, and which retains some
biological activity of a transcription factor. Where "amino acid sequence" is recited to 
refer to an amino acid sequence of a naturally occurring protein molecule, "amino 
acid sequence" and like terms are not meant to limit the amino acid sequence to the 
complete native amino acid sequence associated with the recited protein molecule. 

As one of ordinary skill in the art recognizes, transcription factors can be 
identified by the presence of a region or domain of structural similarity or identity to a 
specific consensus sequence or the presence of a specific consensus DNA-binding site 
or DNA-binding site motif (see, for example, Riechmann et al., (2000) Science 290: 
2105-2110). The plant transcription factors may belong to one of the following
transcription factor families: the AP2 (APETALA2) domain transcription factor 
family (Riechmann and Meyerowitz (1998) Biol Chem. 379:633-646); the MYB 
transcription factor family (Martin and Paz-Ares, (1997) Trends Genet 13:67-73); the 
MADS domain transcription factor family (Riechmann and Meyerowitz (1997) Biol 
Chem. 378:1079-1101); the WRKY protein family (Ishiguro and Nakamura (1994)
Mol Gen. Genet 244:563-571); the ankyrin-repeat protein family (Zhang et al. 
(1992) Plant Cell 4:1575-1588); the zinc finger protein (Z) family (Klug and Schwabe 
(1995) FASEB J. 9: 597-604); the homeobox (HB) protein family (Buerglin in 
Guidebook to the Homeobox Genes, Duboule (ed.) (1994) Oxford University Press); 
the CAAT-element binding proteins (Forsburg and Guarente (1989) Genes Dev.
3:1166-1178); the squamosa promoter binding proteins (SPB) (Klein et al. (1996)
Mol. Gen. Genet. 250:7-16); the NAM protein family (Souer et al. (1996) Cell
85:159-170); the IAA/AUX proteins (Rouse et al. (1998) Science 279:1371-1373); 
the HLH/MYC protein family (Littlewood et al. (1994) Prot. Profile 1:639-709); the 
DNA-binding protein (DBP) family (Tucker et al. (1994) EMBO J. 13:2994-3002); 
the bZIP family of transcription factors (Foster et al. (1994) FASEB J. 8:192-200); the
Box P-binding protein (the BPF-1) family (da Costa e Silva et al. (1993) Plant J. 
4:125-135); the high mobility group (HMG) family (Bustin and Reeves (1996) Prog. 
Nucl. Acids Res. Mol. Biol. 54:35-100); the scarecrow (SCR) family (Di Laurenzio et 
al. (1996) Cell 86:423-433); the GF14 family (Wu et al. (1997) Plant Physiol. 
114:1421-1431); the polycomb (PCOMB) family (Kennison (1995) Annu. Rev. Genet.
29:289-303); the teosinte branched (TEO) family (Luo et al. (1996) Nature
383:794-799); the ABI3 family (Giraudat et al. (1992) Plant Cell 4:1251-1261); the triple helix
(TH) family (Dehesh et al. (1990) Science 250:1397-1399); the EIL family (Chao et
al. (1997) Cell 89:1133-44); the AT-HOOK family (Reeves and Nissen (1990) J. Biol.
Chem. 265:8573-8582); the S1FA family (Zhou et al. (1995) Nucleic Acids Res.
23:1165-1169); the bZIPT2 family (Lu and Ferl (1995) Plant Physiol. 109:723); the
YABBY family (Bowman et al. (1999) Development 126:2387-96); the PAZ family 
(Bohmert et al. (1998) EMBO J. 17:170-80); a family of miscellaneous (MISC) 
transcription factors including the DPBF family (Kim et al. (1997) Plant J.
11:1237-1251) and the SPF1 family (Ishiguro and Nakamura (1994) Mol. Gen. Genet.
244:563-571); the golden (GLD) family (Hall et al. (1998) Plant Cell 10:925-936),
the TUBBY family (Boggin et al. (1999) Science 286:2119-2125), the heat shock
family (Wu C (1995) Annu Rev Cell Dev Biol 11:441-469), the ENBP family
(Christiansen et al (1996) Plant Mol Biol 32:809-821), the RING-zinc family (Jensen
et al. (1998) FEBS Letters 436:283-287), the PDBP family (Janik et al Virology.
(1989) 168:320-329), the PCF family (Cubas P, et al. Plant J. (1999) 18:215-22), the
SRS (SHI-related) family (Fridborg et al Plant Cell (1999) 11:1019-1032), the CPP
(cysteine-rich polycomb-like) family (Cvitanich et al Proc. Natl. Acad. Sci. USA. 
(2000) 97:8163-8168), the ARF (auxin response factor) family (Ulmasov, et al. 
(1999) Proc. Natl. Acad. Sci. USA 96: 5844-5849), the SWI/SNF family 
(Collingwood et al J. Mol. End. 23:255-275), the ACBF family (Seguin et al (1997) 
Plant Mol Biol. 35:281-291), PCGL (CG-1 like) family (da Costa e Silva et al. 
(1994) Plant Mol Biol. 25:921-924), the ARID family (Vazquez et al. (1999)
Development 126: 733-42), the Jumonji family (Balciunas et al (2000) Trends
Biochem Sci 25: 274-276), the bZIP-NIN family (Schauser et al (1999) Nature 402:
191-195), the E2F family (Kaelin et al (1992) Cell 70: 351-364) and the GRF-like
family (Knaap et al (2000) Plant Physiol 122: 695-704). As indicated by any part of 
the list above and as known in the art, transcription factors have been sometimes 
categorized by class, family, and sub-family according to their structural content and 
consensus DNA-binding site motif, for example. Many of the classes and many of the 
families and sub-families are listed here. However, the inclusion of one sub-family 
and not another, or the inclusion of one family and not another, does not mean that the 
invention does not encompass polynucleotides or polypeptides of a certain family or 
sub-family. The list provided here is merely an example of the types of transcription 
factors and the knowledge available concerning the consensus sequences and 
consensus DNA-binding site motifs that help define them as known to those of skill in 
the art (each of the references noted above is specifically incorporated herein by
reference). A transcription factor may include, but is not limited to, any polypeptide 
that can activate or repress transcription of a single gene or a number of genes. This 
polypeptide group includes, but is not limited to, DNA-binding proteins, DNA- 
binding protein binding proteins, protein kinases, protein phosphatases, GTP-binding 
proteins, and receptors, and the like. 

In addition to methods for modifying a plant phenotype by employing one or 
more polynucleotides and polypeptides of the invention described herein, the 
polynucleotides and polypeptides of the invention have a variety of additional uses. 
These uses include their use in the recombinant production (i.e., expression) of 
proteins; as regulators of plant gene expression, as diagnostic probes for the presence 
of complementary or partially complementary nucleic acids (including for detection 
of natural coding nucleic acids); as substrates for further reactions, e.g., mutation 
reactions, PCR reactions, or the like; as substrates for cloning e.g., including digestion 
or ligation reactions; and for identifying exogenous or endogenous modulators of the 
transcription factors. 

A "polynucleotide" is a nucleic acid sequence comprising a 
plurality of polymerized nucleotides, e.g., at least about 15 consecutive polymerized 
nucleotides, optionally at least about 30 consecutive nucleotides, at least about 50 
consecutive nucleotides. In many instances, a polynucleotide comprises a nucleotide 
sequence encoding a polypeptide (or protein) or a domain or fragment thereof. 
Additionally, the polynucleotide may comprise a promoter, an intron, an enhancer 
region, a polyadenylation site, a translation initiation site, 5' or 3' untranslated 
regions, a reporter gene, a selectable marker, or the like. The polynucleotide can be 
single stranded or double stranded DNA or RNA. The polynucleotide optionally 
comprises modified bases or a modified backbone. The polynucleotide can be, e.g., 
genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a 
cloned DNA, a synthetic DNA or RNA, or the like. The polynucleotide can comprise 
a sequence in either sense or antisense orientations. 

A "recombinant polynucleotide" is a polynucleotide that is not in its native 
state, e.g., the polynucleotide comprises a nucleotide sequence not found in nature, or 
the polynucleotide is in a context other than that in which it is naturally found, e.g., 
separated from nucleotide sequences with which it typically is in proximity in nature, 
or adjacent (or contiguous with) nucleotide sequences with which it typically is not in 
proximity. For example, the sequence at issue can be cloned into a vector, or 
otherwise recombined with one or more additional nucleic acid. 

An "isolated polynucleotide" is a polynucleotide, whether naturally occurring 
or recombinant, that is present outside the cell in which it is typically found in nature, 
whether purified or not. Optionally, an isolated polynucleotide is subject to one or 
more enrichment or purification procedures, e.g., cell lysis, extraction, centrifugation, 
precipitation, or the like. 

A "polypeptide" is an amino acid sequence comprising a plurality of 
consecutive polymerized amino acid residues e.g., at least about 15 consecutive 
polymerized amino acid residues, optionally at least about 30 consecutive 
polymerized amino acid residues, at least about 50 consecutive polymerized amino 
acid residues. In many instances, a polypeptide comprises a polymerized amino acid 
residue sequence that is a transcription factor or a domain or portion or fragment 
thereof. Additionally, the polypeptide may comprise 1) a localization domain, 2) an 
activation domain, 3) a repression domain, 4) an oligomerization domain or 5) a 
DNA-binding domain, or the like. The polypeptide optionally comprises modified 
amino acid residues, naturally occurring amino acid residues not encoded by a codon, 
or non-naturally occurring amino acid residues. 

A "recombinant polypeptide" is a polypeptide produced by translation of a 
recombinant polynucleotide. A "synthetic polypeptide" is a polypeptide created by 
consecutive polymerization of isolated amino acid residues using methods well 
known in the art. An "isolated polypeptide," whether a naturally occurring or a 
recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in 
its natural state in a wild type cell, e.g., more than about 5% enriched, more than 
about 10% enriched, or more than about 20%, or more than about 50%, or more, 
enriched, i.e., alternatively denoted: 105%, 110%, 120%, 150% or more, enriched 
relative to wild type standardized at 100%. Such an enrichment is not the result of a 
natural response of a wild type plant. Alternatively, or additionally, the isolated 
polypeptide is separated from other cellular components with which it is typically 
associated, e.g., by any of the various protein purification methods herein. 

"Identity" or "similarity" refers to sequence similarity between two 
polynucleotide sequences or between two polypeptide sequences, with identity being 
a more strict comparison. The phrases "percent identity" and "% identity" refer to the 
percentage of sequence similarity found in a comparison of two or more 
polynucleotide sequences or two or more polypeptide sequences. Identity or 
similarity can be determined by comparing a position in each sequence that may be 
aligned for purposes of comparison. When a position in the compared sequence is 
occupied by the same nucleotide base or amino acid, then the molecules are identical 
at that position. A degree of similarity or identity between polynucleotide sequences 
is a function of the number of identical or matching nucleotides at positions shared by 
the polynucleotide sequences. A degree of identity of polypeptide sequences is a 
function of the number of identical amino acids at positions shared by the polypeptide 
sequences. A degree of homology or similarity of polypeptide sequences is a function 
of the number of amino acids, i.e., structurally related, at positions shared by the 
polypeptide sequences. 
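
As an illustration only, the percent identity described above can be sketched in a few lines of Python. The sketch assumes the two sequences have already been aligned to equal length by some external tool, that "-" marks a gap, and that gapped positions are not counted as shared; the function name percent_identity is hypothetical and not part of this disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of shared (non-gap) aligned positions that hold the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    shared = 0      # positions occupied in both sequences
    identical = 0   # shared positions with the same base or amino acid
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: a gapped position is not "shared"
        shared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / shared if shared else 0.0


if __name__ == "__main__":
    # Six of eight aligned nucleotide positions match.
    print(percent_identity("ACGTACGT", "ACGTTCGA"))
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give different values for the same alignment, so the denominator should be stated whenever a percent identity is reported.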

"Altered" nucleic acid sequences encoding polypeptide include those 
sequences with deletions, insertions, or substitutions of different nucleotides, resulting 
in a polynucleotide encoding a polypeptide with at least one functional characteristic 
of the polypeptide. Included within this definition are polymorphisms that may or 
may not be readily detectable using a particular oligonucleotide probe of the 
polynucleotide encoding polypeptide, and improper or unexpected hybridization to 
allelic variants, with a locus other than the normal chromosomal locus for the 
polynucleotide sequence encoding polypeptide. The encoded polypeptide protein 
may also be "altered", and may contain deletions, insertions, or substitutions of amino 
acid residues that produce a silent change and result in a functionally equivalent 
polypeptide. Deliberate amino acid substitutions may be made on the basis of 
similarity in residue side chain chemistry, including, but not limited to, polarity, 
charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of 
the residues, as long as the biological activity of polypeptide is retained. For 
example, negatively charged amino acids may include aspartic acid and glutamic acid, 
positively charged amino acids may include lysine and arginine, and amino acids with 
uncharged polar head groups having similar hydrophilicity values may include 
leucine, isoleucine, and valine; glycine and alanine; asparagine and glutamine; serine 
and threonine; and phenylalanine and tyrosine. Alignments between different 
polypeptide sequences may be used to calculate "percentage sequence similarity". 
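
By way of a hedged example, a "percentage sequence similarity" can be computed by counting aligned positions that are either identical or fall within one of the substitution groups named above. The grouping below is taken directly from the preceding paragraph; the function names, the gap symbol "-", and the choice to ignore gapped positions are assumptions made for this sketch.

```python
# Substitution groups as listed in the text (one-letter amino acid codes).
CONSERVATIVE_GROUPS = [
    set("DE"),   # aspartic acid, glutamic acid
    set("KR"),   # lysine, arginine
    set("LIV"),  # leucine, isoleucine, valine
    set("GA"),   # glycine, alanine
    set("NQ"),   # asparagine, glutamine
    set("ST"),   # serine, threonine
    set("FY"),   # phenylalanine, tyrosine
]


def is_similar(a: str, b: str) -> bool:
    """True if residues are identical or belong to the same group above."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of shared aligned positions holding identical or similar residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    shared = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped positions are not counted
        shared += 1
        if is_similar(a, b):
            similar += 1
    return 100.0 * similar / shared if shared else 0.0


if __name__ == "__main__":
    # Six of seven positions are identical or conservative; F versus W is not.
    print(round(percent_similarity("DKLGNSF", "EKIANTW"), 1))
```

Alignment programs generally score similarity with a substitution matrix rather than fixed groups; the group-based count here is only the simplest reading of the paragraph above.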

The term "plant" includes whole plants, shoot vegetative organs/structures 
(e.g., leaves, stems and tubers), roots, flowers and floral organs/structures (e.g., 
bracts, sepals, petals, stamens, carpels, anthers and ovules), seed (including embryo, 
endosperm, and seed coat) and fruit (the mature ovary), plant tissue (e.g., vascular 
tissue, ground tissue, and the like) and cells (e.g., guard cells, egg cells, and the like), 
and progeny of same. The class of plants that can be used in the method of the 
invention is generally as broad as the class of higher and lower plants amenable to 
transformation techniques, including angiosperms (monocotyledonous and 
dicotyledonous plants), gymnosperms, ferns, horsetails, psilophytes, lycophytes, 
bryophytes, and multicellular algae. (See for example, Figure 1, adapted from Daly et 
al. 2001 Plant Physiology 127:1328-1333; and see also Tudge, C, The Variety of 
Life, Oxford University Press, New York, 2000, pp. 547-606.) 

A "transgenic plant" refers to a plant that contains genetic material not found 
in a wild type plant of the same species, variety or cultivar. The genetic material may 
include a transgene, an insertional mutagenesis event (such as by transposon or 
T-DNA insertional mutagenesis), an activation tagging sequence, a mutated sequence, a 
homologous recombination event or a sequence modified by chimeraplasty. 
Typically, the foreign genetic material has been introduced into the plant by human 
manipulation, but any method can be used as one of skill in the art recognizes. 

A transgenic plant may contain an expression vector or cassette. The 
expression cassette typically comprises a polypeptide-encoding sequence operably 
linked (i.e., under regulatory control of) to appropriate inducible or constitutive 
regulatory sequences that allow for the expression of polypeptide. The expression 
cassette can be introduced into a plant by transformation or by breeding after 
transformation of a parent plant. A plant refers to a whole plant as well as to a plant 
part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any other plant 
material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems 
that mimic biochemical or cellular components or processes in a cell. 

"Ectopic expression or altered expression" in reference to a polynucleotide 
indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is 
different from the expression pattern in a wild type plant or a reference plant of the 
same species. The pattern of expression may also be compared with a reference 
expression pattern in a wild type plant of the same species. For example, the 
polynucleotide or polypeptide is expressed in a cell or tissue type other than a cell or 
tissue type in which the sequence is expressed in the wild type plant, or by expression 
at a time other than at the time the sequence is expressed in the wild type plant, or by 
a response to different inducible agents, such as hormones or environmental signals, 
or at different expression levels (either higher or lower) compared with those found in 
a wild type plant. The term also refers to altered expression patterns that are produced 
by lowering the levels of expression to below the detection level or completely 
abolishing expression. The resulting expression pattern can be transient or stable, 
constitutive or inducible. In reference to a polypeptide, the term "ectopic expression 
or altered expression" further may relate to altered activity levels resulting from the 
interactions of the polypeptides with exogenous or endogenous modulators or from 
interactions with factors or as a result of the chemical modification of the 
polypeptides. 

A "fragment" or "domain," with respect to a polypeptide, refers to a 
subsequence of the polypeptide. In some cases, the fragment or domain, is a 
subsequence of the polypeptide which performs at least one biological function of the 
intact polypeptide in substantially the same manner, or to a similar extent, as does the 
intact polypeptide. For example, a polypeptide fragment can comprise a recognizable 
structural motif or functional domain such as a DNA-binding site or domain that 
binds to a DNA promoter region, an activation domain, or a domain for protein- 
protein interactions. Fragments can vary in size from as few as 6 amino acids to the 
full length of the intact polypeptide, but are preferably at least about 30 amino acids in 
length and more preferably at least about 60 amino acids in length. In reference to a 
polynucleotide sequence, "a fragment" refers to any subsequence of a polynucleotide, 
typically, of at least about 15 consecutive nucleotides, preferably at least about 30 
nucleotides, more preferably at least about 50 nucleotides, of any of the sequences 
provided herein. 

The invention also encompasses production of DNA sequences that encode 
transcription factors and transcription factor derivatives, or fragments thereof, entirely 
by synthetic chemistry. After production, the synthetic sequence may be inserted into 
any of the many available expression vectors and cell systems using reagents well 
known in the art. Moreover, synthetic chemistry may be used to introduce mutations 
into a sequence encoding transcription factors or any fragment thereof. 

A "conserved domain", with respect to a polypeptide, refers to a domain 
within a transcription factor family which exhibits a higher degree of sequence 
homology, such as at least 65% sequence identity including conservative 
substitutions, and preferably at least 80% sequence identity, and more preferably at 
least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at 
least about 90%, or at least about 95%, or at least about 98% amino acid residue 
sequence identity of a polypeptide of consecutive amino acid residues. A fragment or 
domain can be referred to as outside a consensus sequence or outside a consensus 
DNA-binding site that is known to exist or that exists for a particular transcription 
factor class, family, or sub-family. In this case, the fragment or domain will not 
include the exact amino acids of a consensus sequence or consensus DNA-binding 
site of a transcription factor class, family or sub-family, or the exact amino acids of a 
particular transcription factor consensus sequence or consensus DNA-binding site. 
Furthermore, a particular fragment, region, or domain of a polypeptide, or a 
polynucleotide encoding a polypeptide, can be "outside a conserved domain" if all the 
amino acids of the fragment, region, or domain fall outside of a defined conserved 
domain(s) for a polypeptide or protein. The conserved domains for each of 
polypeptides of SEQ ID NOs:2 - 2N, where N = 2-561, are listed in Table 4 as 
described in Example VII. Also, many of the polypeptides of Table 4 have conserved 
domains specifically indicated by start and stop sites. A comparison of the regions of 
the polypeptides in SEQ ID NOs:2 - 2N, where N = 2-561, or of those in Table 4, 
allows one of skill in the art to identify conserved domain(s) for any of the 
polypeptides listed or referred to in this disclosure, including those in Table 4. 

A "trait" refers to a physiological, morphological, biochemical, or physical 
characteristic of a plant or particular plant material or cell. In some instances, this 
characteristic is visible to the human eye, such as seed or plant size, or can be 
measured by biochemical techniques, such as detecting the protein, starch, or oil 
content of seed or leaves, or by observation of a metabolic or physiological process, 
e.g. by measuring uptake of carbon dioxide, or by the observation of the expression 
level of a gene or genes, e.g., by employing Northern analysis, RT-PCR, microarray 
gene expression assays, or reporter gene expression systems, or by agricultural 
observations such as stress tolerance, yield, or pathogen tolerance. Any technique can 
be used to measure the amount of, comparative level of, or difference in any selected 
chemical compound or macromolecule in the transgenic plants, however. 

"Trait modification" refers to a detectable difference in a characteristic in a 
plant ectopically expressing a polynucleotide or polypeptide of the present invention 
relative to a plant not doing so, such as a wild type plant. In some cases, the trait 
modification can be evaluated quantitatively. For example, the trait modification can 
entail at least about a 2% increase or decrease in an observed trait (difference), at least 
a 5% difference, at least about a 10% difference, at least about a 20% difference, at 
least about a 30%, at least about a 50%, at least about a 70%, or at least about a 100%, 
or an even greater difference compared with a wild type plant. It is known that there 
can be a natural variation in the modified trait. Therefore, the trait modification 
observed entails a change of the normal distribution of the trait in the plants compared 
with the distribution observed in wild type plants. 
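
For illustration, the quantitative evaluation described above can be sketched as a percent difference between a transgenic line and wild type, followed by a lookup of the highest listed threshold that the difference meets. This is a minimal sketch that compares group means only, which is an assumption made here; a real analysis would also account for the natural variation in the trait noted above. All names and the sample measurements are hypothetical.

```python
from statistics import mean

# Thresholds (in percent) listed in the definition of "trait modification".
THRESHOLDS = (2, 5, 10, 20, 30, 50, 70, 100)


def percent_difference(transgenic, wild_type):
    """Signed percent change of the transgenic mean relative to the wild type mean."""
    wt_mean = mean(wild_type)
    if wt_mean == 0:
        raise ValueError("wild type mean is zero; percent difference is undefined")
    return 100.0 * (mean(transgenic) - wt_mean) / wt_mean


def highest_threshold_met(pct):
    """Largest listed threshold that |pct| reaches, or None if below 2%."""
    met = [t for t in THRESHOLDS if abs(pct) >= t]
    return max(met) if met else None


if __name__ == "__main__":
    wild_type = [10.0, 10.0]    # hypothetical measurements, e.g. seed weight
    transgenic = [12.0, 13.0]   # hypothetical measurements
    pct = percent_difference(transgenic, wild_type)
    print(pct, highest_threshold_met(pct))
```

Because the definition covers both increases and decreases, the threshold lookup uses the absolute value of the difference.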

I. Traits Which May Be Modified 

Trait modifications of particular interest include those to seed (such as embryo 
or endosperm), fruit, root, flower, leaf, stem, shoot, seedling or the like, including: 
enhanced tolerance to environmental conditions including freezing, chilling, heat, 
drought, water saturation, radiation and ozone; improved tolerance to microbial, 
fungal or viral diseases; improved tolerance to pest infestations, including nematodes, 
mollicutes, parasitic higher plants or the like; decreased herbicide sensitivity; 
improved tolerance of heavy metals or enhanced ability to take up heavy metals; 
improved growth under poor photoconditions (e.g., low light and/or short day length), 
or changes in expression levels of genes of interest. Other phenotypes that can be 
modified relate to the production of plant metabolites, such as variations in the 
production of taxol, tocopherol, tocotrienol, sterols, phytosterols, vitamins, wax 
monomers, anti-oxidants, amino acids, lignins, cellulose, tannins, prenyllipids (such 
as chlorophylls and carotenoids), glucosinolates, and terpenoids, enhanced or 
compositionally altered protein or oil production (especially in seeds), or modified 
sugar (insoluble or soluble) and/or starch composition. Physical plant characteristics 
that can be modified include cell development (such as the number of trichomes), fruit 
and seed size and number, yields of plant parts such as stems, leaves, inflorescences, 
and roots, the stability of the seeds during storage, characteristics of the seed pod 
(e.g., susceptibility to shattering), root hair length and quantity, internode distances, or 
the quality of seed coat. Plant growth characteristics that can be modified include 
growth rate, germination rate of seeds, vigor of plants and seedlings, leaf and flower 
senescence, male sterility, apomixis, flowering time, flower abscission, rate of 
nitrogen uptake, osmotic sensitivity to soluble sugar concentrations, biomass or 
transpiration characteristics, as well as plant architecture characteristics such as apical 
dominance, branching patterns, number of organs, organ identity, organ shape or size. 

II. Transcription Factors Modify Expression Of Endogenous Genes 

Expression of genes which encode transcription factors that modify expression 
of endogenous genes, polynucleotides, and proteins is well known in the art. In 
addition, transgenic plants comprising isolated polynucleotides encoding transcription 
factors may also modify expression of endogenous genes, polynucleotides, and 
proteins. Examples include Peng et al. (1997, Genes and Development 11:3194-3205) 
and Peng et al. (1999, Nature, 400:256-261). In addition, many others have 
demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant 
species elicits the same or very similar phenotypic response. See, for example, Fu et 
al. (2001, Plant Cell 13:1791-1802); Nandi et al. (2000, Curr. Biol. 10:215-218); 
Coupland (1995, Nature 377:482-483); and Weigel and Nilsson (1995, Nature 
377:482-500). 

In another example, Mandel et al. (1992, Cell 71:133-143) and Suzuki et al. 
(2001, Plant J. 28:409-418) teach that a transcription factor expressed in another plant 
species elicits the same or very similar phenotypic response of the endogenous 
sequence, as often predicted in earlier studies of Arabidopsis transcription factors in 
Arabidopsis (see Mandel et al., 1992, supra; Suzuki et al., 2001, supra). 

Other examples include Müller et al. (2001, Plant J. 28:169-179); Kim et al. 
(2001, Plant J. 25:247-259); Kyozuka and Shimamoto (2002, Plant Cell Physiol. 
43:130-135); Boss and Thomas (2002, Nature, 416:847-850); He et al. (2000, 
Transgenic Res., 9:223-227); and Robson et al. (2001, Plant J. 28:619-631). 

In yet another example, Gilmour et al. (1998, Plant J. 16:433-442) teach an 
Arabidopsis AP2 transcription factor, CBF1, which, when overexpressed in transgenic 
plants, increases plant freezing tolerance. Jaglo et al (2001, Plant Physiol. 127:910- 
917) further identified sequences in Brassica napus which encode CBF-like genes and 
that transcripts for these genes accumulated rapidly in response to low temperature. 
Transcripts encoding CBF-like proteins were also found to accumulate rapidly in 
response to low temperature in wheat, as well as in tomato. An alignment of the CBF 
proteins from Arabidopsis, B. napus, wheat, rye, and tomato revealed the presence of 
conserved amino acid sequences, PKK/RPAGRxKFxETRHP and DSAWR, that 
bracket the AP2/EREBP DNA binding domains of the proteins and distinguish them 
from other members of the AP2/EREBP protein family. (See Jaglo et al., supra.) 
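
As a sketch of how the two conserved sequences could be searched for in a candidate protein, the motifs can be written as regular expressions. The sketch assumes that "K/R" denotes either lysine or arginine at that position and that "x" denotes any single residue; it also assumes the second motif lies C-terminal to the first, since the two are said to bracket the DNA binding domain. The function name and the test sequence are hypothetical and are not CBF sequences from the Sequence Listing.

```python
import re

# PKK/RPAGRxKFxETRHP read as: P K [K or R] P A G R x K F x E T R H P
CBF_UPSTREAM_MOTIF = re.compile(r"PK[KR]PAGR.KF.ETRHP")
CBF_DOWNSTREAM_MOTIF = re.compile(r"DSAWR")


def find_cbf_signature(protein: str):
    """Return (start1, end1, start2, end2) of the bracketing motifs, or None.

    Coordinates are 0-based and half-open, following Python slicing.
    """
    first = CBF_UPSTREAM_MOTIF.search(protein)
    if first is None:
        return None
    # Assumption: the second motif must occur after the first one.
    second = CBF_DOWNSTREAM_MOTIF.search(protein, first.end())
    if second is None:
        return None
    return (first.start(), first.end(), second.start(), second.end())


if __name__ == "__main__":
    # Hypothetical sequence built only to exercise the patterns.
    toy = "MAAA" + "PKKPAGRKKFRETRHP" + "G" * 10 + "DSAWR" + "LL"
    print(find_cbf_signature(toy))
```

The region between the two reported spans would then be the candidate AP2/EREBP DNA binding domain under the assumptions stated above.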

III. Polypeptides and Polynucleotides of the Invention 

The present invention provides, among other things, transcription factors 
(TFs), and transcription factor homologue polypeptides, and isolated or recombinant 
polynucleotides encoding the polypeptides, or novel variant polypeptides or 
polynucleotides encoding novel variants of transcription factors derived from the 
specific sequences provided here. These polypeptides and polynucleotides may be 
employed to modify a plant's characteristic. 

Exemplary polynucleotides encoding the polypeptides of the invention were 
identified in the Arabidopsis thaliana GenBank database using publicly available 
sequence analysis programs and parameters. Sequences initially identified were then 
further characterized to identify sequences comprising specified sequence strings 
corresponding to sequence motifs present in families of known transcription factors. 
In addition, further exemplary polynucleotides encoding the polypeptides of the 
invention were identified in the plant GenBank database using publicly available 
sequence analysis programs and parameters. Sequences initially identified were then 
further characterized to identify sequences comprising specified sequence strings 
corresponding to sequence motifs present in families of known transcription factors. 
Polynucleotide sequences meeting such criteria were confirmed as transcription 
factors. 

Additional polynucleotides of the invention were identified by screening 
Arabidopsis thaliana and/or other plant cDNA libraries with probes corresponding to 
known transcription factors under low stringency hybridization conditions. 
Additional sequences, including full length coding sequences were subsequently 
recovered by the rapid amplification of cDNA ends (RACE) procedure, using a 
commercially available kit according to the manufacturer's instructions. Where 
necessary, multiple rounds of RACE are performed to isolate 5' and 3' ends. The full 
length cDNA was then recovered by a routine end-to-end polymerase chain reaction 
(PCR) using primers specific to the isolated 5' and 3' ends. Exemplary sequences are 
provided in the Sequence Listing. 

The polynucleotides of the invention can be or were ectopically expressed in 
overexpressor or knockout plants and the changes in the characteristic(s) or trait(s) of 
the plants observed. Therefore, the polynucleotides and polypeptides can be 
employed to improve the characteristics of plants. 



The polynucleotides of the invention can be or were ectopically expressed in 
overexpressor plant cells and the changes in the expression levels of a number of 
genes, polynucleotides, and/or proteins of the plant cells observed. Therefore, the 
polynucleotides and polypeptides can be employed to change expression levels of 
genes, polynucleotides, and/or proteins of plants. 

IV. Producing Polypeptides 

The polynucleotides of the invention include sequences that encode 
transcription factors and transcription factor homologue polypeptides and sequences 
complementary thereto, as well as unique fragments of coding sequence, or sequence 
complementary thereto. Such polynucleotides can be, e.g., DNA or RNA, e.g., 
mRNA, cRNA, synthetic RNA, genomic DNA, cDNA, synthetic DNA, 
oligonucleotides, etc. The polynucleotides are either double-stranded or single- 
stranded, and include either, or both sense (i.e., coding) sequences and antisense (i.e., 
non-coding, complementary) sequences. The polynucleotides include the coding 
sequence of a transcription factor, or transcription factor homologue polypeptide, in 
isolation, in combination with additional coding sequences (e.g., a purification tag, a 
localization signal, as a fusion-protein, as a pre-protein, or the like), in combination 
with non-coding sequences (e.g., introns or inteins, regulatory elements such as 
promoters, enhancers, terminators, and the like), and/or in a vector or host 
environment in which the polynucleotide encoding a transcription factor or 
transcription factor homologue polypeptide is an endogenous or exogenous gene. 
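
To make the relationship between a sense sequence and its complementary antisense sequence concrete, a minimal sketch follows. It assumes an unambiguous DNA alphabet (A, C, G, T) and reports the complementary strand in the conventional 5' to 3' direction; the function name is hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    sense = "ATGGCC"                    # hypothetical coding-strand fragment
    print(reverse_complement(sense))    # complementary (antisense) strand
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing probe or primer sequences.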

A variety of methods exist for producing the polynucleotides of the invention. 
Procedures for identifying and isolating DNA clones are well known to those of skill 
in the art, and are described in, e.g., Berger and Kimmel, Guide to Molecular Cloning 
Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego, 
CA ("Berger"); Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd 
Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989 
("Sambrook") and Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., 
Current Protocols, a joint venture between Greene Publishing Associates, Inc. and 
John Wiley & Sons, Inc., (supplemented through 2000) ("Ausubel"). 

Alternatively, polynucleotides of the invention can be produced by a variety 
of in vitro amplification methods adapted to the present invention by appropriate 
selection of specific or degenerate primers. Examples of protocols sufficient to direct 
persons of skill through in vitro amplification methods, including the polymerase 
chain reaction (PCR), the ligase chain reaction (LCR), Qbeta-replicase amplification 
and other RNA polymerase mediated techniques (e.g., NASBA), e.g., for the 
production of the homologous nucleic acids of the invention are found in Berger 
(supra), Sambrook (supra), and Ausubel (supra), as well as Mullis et al., (1987) PCR 
Protocols A Guide to Methods and Applications (Innis et al. eds) Academic Press Inc. 
San Diego, CA (1990) (Innis). Improved methods for cloning in vitro amplified 
nucleic acids are described in Wallace et al., U.S. Pat. No. 5,426,039. Improved 
methods for amplifying large nucleic acids by PCR are summarized in Cheng et al. 
(1994) Nature 369: 684-685 and the references cited therein, in which PCR amplicons 
of up to 40kb are generated. One of skill will appreciate that essentially any RNA can 
be converted into a double stranded DNA suitable for restriction digestion, PCR 
expansion and sequencing using reverse transcriptase and a polymerase. See, e.g., 
Ausubel, Sambrook and Berger, all supra. 

Alternatively, polynucleotides and oligonucleotides of the invention can be 
assembled from fragments produced by solid-phase synthesis methods. Typically, 
fragments of up to approximately 100 bases are individually synthesized and then 
enzymatically or chemically ligated to produce a desired sequence, e.g., a 
polynucleotide encoding all or part of a transcription factor. For example, chemical 
synthesis using the phosphoramidite method is described, e.g., by Beaucage et al. 
(1981) Tetrahedron Letters 22:1859-1869; and Matthes et al. (1984) EMBO J. 3:801- 
805. According to such methods, oligonucleotides are synthesized, purified, annealed 
to their complementary strand, ligated and then optionally cloned into suitable 
vectors. And if so desired, the polynucleotides and polypeptides of the invention can 
be custom ordered from any of a number of commercial suppliers. 
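
The fragment-based assembly described above can be pictured with a deliberately simplified sketch: a target sequence is cut into consecutive pieces of at most 100 bases, and joining the pieces in order restores the target. This models a single strand only and ignores the complementary-strand oligonucleotides and overlaps that an actual synthesis design would require; the function names and the 100-base default are assumptions drawn from the "approximately 100 bases" figure in the text.

```python
def split_into_fragments(sequence: str, max_len: int = 100) -> list:
    """Cut a sequence into consecutive fragments of at most max_len bases."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [sequence[i:i + max_len] for i in range(0, len(sequence), max_len)]


def ligate(fragments) -> str:
    """Join fragments end to end, in order (a stand-in for ligation)."""
    return "".join(fragments)


if __name__ == "__main__":
    target = "ACGT" * 60 + "AC" * 5          # hypothetical 250-base target
    pieces = split_into_fragments(target)
    print([len(p) for p in pieces])           # fragment lengths
    print(ligate(pieces) == target)           # round-trip check
```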

V. Homologous Sequences 

Sequences homologous, i.e., that share significant sequence identity or 
similarity, to those provided in the Sequence Listing, derived from Arabidopsis 
thaliana or from other plants of choice are also an aspect of the invention. 
Homologous sequences can be derived from any plant including monocots and dicots 
and in particular agriculturally important plant species, including but not limited to, 
crops such as soybean, wheat, corn, potato, cotton, rice, rape, oilseed rape (including 
canola), sunflower, alfalfa, sugarcane and turf; or fruits and vegetables, such as 
banana, blackberry, blueberry, strawberry, and raspberry, cantaloupe, carrot, 
cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, 
onion, papaya, peas, peppers, pineapple, pumpkin, spinach, squash, sweet corn, 
tobacco, tomato, watermelon, rosaceous fruits (such as apple, peach, pear, cherry and 
plum) and vegetable brassicas (such as broccoli, cabbage, cauliflower, Brussels 
sprouts, and kohlrabi). Other crops, fruits and vegetables whose phenotype can be 
changed include barley, rye, millet, sorghum, currant, avocado, citrus fruits such as 
oranges, lemons, grapefruit and tangerines, artichoke, cherries, nuts such as the 
walnut and peanut, endive, leek, roots, such as arrowroot, beet, cassava, turnip, radish, 
yam, and sweet potato, and beans. The homologous sequences may also be derived 
from woody species, such as pine, poplar and eucalyptus, or mint or other labiates. 

Orthologs And Paralogs 

Several different methods are known by those of skill in the art for identifying 
and defining these functionally homologous sequences. Three general methods for 
defining paralogs and orthologs are described; a paralog or ortholog or homolog may 
be identified by one or more of the methods described below. 

Orthologs and paralogs are evolutionarily related genes that have similar 
sequence and similar functions. Orthologs are structurally related genes in different 
species that are derived from a speciation event. Paralogs are structurally related 
genes within a single species that are derived by a duplication event. 

Within a single plant species, gene duplication may cause two copies of a 
particular gene, giving rise to two or more genes with similar sequence and similar 
function known as paralogs. A paralog is therefore a similar gene with a similar 
function within the same species. Paralogs typically cluster together or in the same 
clade (a group of similar genes) when a gene family phylogeny is analyzed using 
programs such as CLUSTAL (Thompson et al. (1994) Nucleic Acids Res. 22:4673-4680; 
Higgins et al. (1996) Methods Enzymol. 266:383-402). Groups of similar 
genes can also be identified with pair-wise BLAST analysis (Feng and Doolittle 
(1987) J. Mol. Evol. 25:351-360). For example, a clade of very similar MADS 
domain transcription factors from Arabidopsis all share a common function in 
flowering time (Ratcliffe et al. (2001) Plant Physiol. 126:122-132), and a group of 
very similar AP2 domain transcription factors from Arabidopsis are involved in 
tolerance of plants to freezing (Gilmour et al. (1998) Plant J. 16:433-442). Analysis 
of groups of similar genes with similar function that fall within one clade can yield 
sub-sequences that are particular to the clade. These sub-sequences, known as 
consensus sequences, can be used not only to define the sequences within each clade, 
but also to define the functions of these genes; genes within a clade may contain paralogous 
or orthologous sequences that share the same function. (See also, for example, Mount, 
D.W. (2001) Bioinformatics: Sequence and Genome Analysis Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, New York page 543.) 

Speciation, the production of new species from a parental species, can also 
give rise to two or more genes with similar sequence and similar function. These 
genes, termed orthologs, often have an identical function within their host plants and 
are often interchangeable between species without losing function. Because plants 
have common ancestors, many genes in any plant species will have a corresponding 
orthologous gene in another plant species. Once a phylogenetic tree for a gene family 
of one species has been constructed using a program such as CLUSTAL (Thompson 
et al. (1994) Nucleic Acids Res. 22:4673-4680; Higgins et al. (1996) Methods 
Enzymol. 266:383-402), potential orthologous sequences can be placed into the 
phylogenetic tree and their relationships to genes from the species of interest can be 
determined. Once the ortholog pair has been identified, the function of the test 
ortholog can be determined by determining the function of the reference ortholog. 

Transcription factors that are homologous to the listed sequences will typically 
share at least about 30% amino acid sequence identity, or at least about 30% amino 
acid sequence identity outside of a known consensus sequence or consensus DNA- 
binding site. More closely related transcription factors can share at least about 50%, 
about 60%, about 65%, about 70%, about 75%, about 80%, about 90%, about 
95%, about 98% or more sequence identity with the listed sequences, or with the 
listed sequences but excluding or outside a known consensus sequence or consensus 
DNA-binding site, or with the listed sequences excluding one or all conserved 
domains. Factors that are most closely related to the listed sequences share, e.g., at 
least about 85%, about 90% or about 95% or more sequence identity to the listed 
sequences, or to the listed sequences but excluding or outside a known consensus 
sequence or consensus DNA-binding site or outside one or all conserved domains. At 
the nucleotide level, the sequences will typically share at least about 40% nucleotide 
sequence identity, preferably at least about 50%, about 60%, about 70% or about 80% 
sequence identity, and more preferably about 85%, about 90%, about 95% or about 
97% or more sequence identity to one or more of the listed sequences, or to a listed 
sequence but excluding or outside a known consensus sequence or consensus DNA- 
binding site, or outside one or all conserved domains. The degeneracy of the genetic 
code enables major variations in the nucleotide sequence of a polynucleotide while 
maintaining the amino acid sequence of the encoded protein. Conserved domains 
within a transcription factor family may exhibit a higher degree of sequence 
homology, such as at least 65% sequence identity including conservative 
substitutions, and preferably at least 80% sequence identity, and more preferably at 
least 85%, or at least about 86%, or at least about 87%, or at least about 88%, or at 
least about 90%, or at least about 95%, or at least about 98% sequence identity. 
Transcription factors that are homologous to the listed sequences should share at least 
30%, or at least about 60%, or at least about 75%, or at least about 80%, or at least 
about 90%, or at least about 95% amino acid sequence identity over the entire length 
of the polypeptide or the homolog. In addition, transcription factors that are 
homologous to the listed sequences should share at least 30%, or at least about 60%, 
or at least about 75%, or at least about 80%, or at least about 90%, or at least about 
95% amino acid sequence similarity over the entire length of the polypeptide or the 
homolog. 

Percent identity can be determined electronically, e.g., by using the 
MEGALIGN program (DNASTAR, Inc. Madison, Wis.). The MEGALIGN program 
can create alignments between two or more sequences according to different methods, 
e.g., the clustal method. (See, e.g., Higgins, D. G. and P. M. Sharp (1988) Gene 
73:237-244.) The clustal algorithm groups sequences into clusters by examining the 
distances between all pairs. The clusters are aligned pairwise and then in groups. 
Other alignment algorithms or programs may be used, including FASTA, BLAST, or 
ENTREZ. FASTA and BLAST are available as a part of the GCG sequence 
analysis package (University of Wisconsin, Madison, Wis.), and can be used with or 
without default settings. ENTREZ is available through the National Center for 
Biotechnology Information. In one embodiment, the percent identity of two 
sequences can be determined by the GCG program with a gap weight of 1, e.g., each 
amino acid gap is weighted as if it were a single amino acid or nucleotide mismatch 
between the two sequences (see USPN 6,262,333). 
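
By way of illustration only, the gap-weighting embodiment above may be sketched as follows. The sketch assumes two sequences that have already been aligned to equal length with "-" marking gaps; the function name percent_identity and the choice to skip columns in which both sequences carry a gap are assumptions of this example, not features of the GCG program.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length sequences.

    Each column containing a gap is scored like a single mismatch,
    mirroring the "gap weight of 1" embodiment described above.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns where both rows are gaps (possible when rows are
    # taken from a multiple alignment) carry no information and are skipped.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # One gap among six columns counts as one mismatch.
    print(round(percent_identity("MKT-LV", "MKTALV"), 1))
```

Under these assumptions a single gap lowers the identity by exactly as much as a single substituted residue would.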

Other techniques for alignment are described in Methods in Enzymology, vol. 
266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, 
Academic Press, Inc., San Diego, Calif., USA. Preferably, an alignment program that 
permits gaps in the sequence is utilized to align the sequences. The Smith-Waterman 
algorithm is one type of algorithm that permits gaps in sequence alignments. See Methods Mol. 
Biol. 70:173-187 (1997). Also, the GAP program using the Needleman and Wunsch 
alignment method can be utilized to align sequences. An alternative search strategy 
uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses a 
Smith-Waterman algorithm to score sequences on a massively parallel computer. 
This approach improves the ability to pick up distantly related matches, and is especially 
tolerant of small gaps and nucleotide sequence errors. Nucleic acid-encoded amino 
acid sequences can be used to search both protein and DNA databases. 
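
As a non-limiting sketch of a gap-permitting local alignment of the Smith-Waterman type, the following computes only the best local alignment score. The scoring values (match 2, mismatch -1, linear gap penalty -2) and the function name are assumptions chosen for illustration; they are not the parameters of MPSRCH, GAP, or any other program named above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty and keeps
    only two rows of the scoring matrix in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACGT"))
    print(smith_waterman_score("ACGT", "ACT"))
```

Because every cell is floored at zero, unrelated flanking sequence does not penalize a short conserved region, which is why the approach can pick up distantly related matches.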

The percentage similarity between two polypeptide sequences, e.g., sequence 
A and sequence B, is calculated by dividing the length of sequence A, minus the 
number of gap residues in sequence A, minus the number of gap residues in sequence 
B, into the sum of the residue matches between sequence A and sequence B, times 
one hundred. Gaps of low or of no similarity between the two amino acid sequences 
are not included in determining percentage similarity. Percent identity between 
polynucleotide sequences can also be counted or calculated by other methods known 
in the art, e.g., the Jotun Hein method. (See, e.g., Hein, J. (1990) Methods Enzymol. 
183:626-645.) Identity between sequences can also be determined by other methods 
known in the art, e.g., by varying hybridization conditions (see US Patent Application 
No. 20010010913). 
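
The percentage similarity calculation described above may be sketched as follows. This is one reading of the formula under stated assumptions: the sequences are supplied already aligned to equal length, "-" marks a gap residue, and the "length of sequence A" is taken to be its aligned length including gap characters. The function name is hypothetical.

```python
def percent_similarity(aligned_a, aligned_b, gap="-"):
    """Residue matches divided by (length of A - gaps in A - gaps in B), x 100.

    Gapped columns are removed from the denominator, so gaps are not
    included in determining the percentage.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    gaps_a = aligned_a.count(gap)
    gaps_b = aligned_b.count(gap)
    compared = len(aligned_a) - gaps_a - gaps_b
    if compared <= 0:
        raise ValueError("no ungapped positions to compare")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    return 100.0 * matches / compared


if __name__ == "__main__":
    # The single gapped column is excluded, leaving 5 matches over 5 positions.
    print(percent_similarity("MKT-LV", "MKTALV"))
```

Note that, on this reading, a gapped column is simply dropped, whereas a method that weights each gap as a mismatch would report a lower value for the same alignment.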

Thus, the invention provides methods for identifying a sequence similar or 
paralogous or orthologous or homologous to one or more polynucleotides as noted 
herein, or to one or more target polypeptides encoded by the polynucleotides or 
otherwise noted herein. The methods may include linking or associating a given plant 
phenotype or gene function with a sequence. In the methods, a sequence database is 
provided (locally or across an internet or intranet) and a query is made against the 
sequence database using the relevant sequences herein and associated plant 
phenotypes or gene functions. 

In addition, one or more polynucleotide sequences or one or more 
polypeptides encoded by the polynucleotide sequences may be used to search against 
BLOCKS (Bairoch et al. (1997) Nucleic Acids Res. 25:217-221), PFAM, and other 
databases which contain previously identified and annotated motifs, sequences and 
gene functions. Methods that search for primary sequence patterns with secondary 
structure gap penalties (Smith et al. (1992) Protein Engineering 5:35-51) as well as 
algorithms such as Basic Local Alignment Search Tool (BLAST; Altschul, S. F. 
(1993) J. Mol. Evol. 36:290-300; Altschul et al. (1990) supra), BLOCKS (Henikoff, 
S. and Henikoff, J. G. (1991) Nucleic Acids Research 19:6565-6572), Hidden Markov 
Models (HMM; Eddy, S. R. (1996) Curr. Opin. Struct. Biol. 6:361-365; Sonnhammer et 
al. (1997) Proteins 28:405-420), and the like, can be used to manipulate and analyze 
polynucleotide and polypeptide sequences encoded by polynucleotides. These 
databases, algorithms and other methods are well known in the art and are described 
in Ausubel et al. (1997; Short Protocols in Molecular Biology, John Wiley & Sons, 
New York N.Y., unit 7.7) and in Meyers, R. A. (1995; Molecular Biology and 
Biotechnology, Wiley VCH, New York N.Y., p 856-853). 

Furthermore, methods using manual alignment of sequences similar or 
homologous to one or more polynucleotide sequences or one or more polypeptides 
encoded by the polynucleotide sequences may be used to identify regions of similarity 
and conserved domains. Such manual methods are well known to those of skill in the 
art and can include, for example, comparisons of tertiary structure between a 
polypeptide sequence encoded by a polynucleotide which comprises a known function 
with a polypeptide sequence encoded by a polynucleotide sequence which has a 
function not yet determined. Such examples of tertiary structure may comprise 
predicted alpha helices, beta-sheets, amphipathic helices, leucine zipper motifs, zinc 
finger motifs, proline-rich regions, cysteine repeat motifs, and the like. 

VI. Identifying Polynucleotides or Nucleic Acids by Hybridization 

Polynucleotides homologous to the sequences illustrated in the Sequence 
Listing and tables can be identified, e.g., by hybridization to each other under 
stringent or under highly stringent conditions. Single stranded polynucleotides 
hybridize when they associate based on a variety of well characterized physical- 
chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the 
like. The stringency of a hybridization reflects the degree of sequence identity of the 
nucleic acids involved, such that the higher the stringency, the more similar are the 
two polynucleotide strands. Stringency is influenced by a variety of factors, including 
temperature, salt concentration and composition, organic and non-organic additives, 
solvents, etc. present in both the hybridization and wash solutions and incubations 
(and number thereof), as described in more detail in the references cited above. 
Encompassed by the invention are polynucleotide sequences that are capable of 
hybridizing to the claimed polynucleotide sequences, and, in particular, to those 
shown in SEQ ID NOs: 860; 802; 240; 274; 558; 24; 1120; 44; 460; 286; 120; 130; 
134; 698; 832; 580; 612; 48, and fragments thereof under various conditions of 
stringency. (See, e.g., Wahl, G. M. and S. L. Berger (1987) Methods Enzymol. 
152:399-407; Kimmel, A. R. (1987) Methods Enzymol. 152:507-511.) Estimates of 
homology are provided by either DNA-DNA or DNA-RNA hybridization under 
conditions of stringency as is well understood by those skilled in the art (Hames and 
Higgins, Eds. (1985) Nucleic Acid Hybridisation, IRL Press, Oxford, U.K.). 
Stringency conditions can be adjusted to screen for moderately similar fragments, 
such as homologous sequences from distantly related organisms, to highly similar 
fragments, such as genes that duplicate functional enzymes from closely related 
organisms. Post-hybridization washes determine stringency conditions. 

In addition to the nucleotide sequences listed in Tables 4 and 5, full length 
cDNA, orthologs, paralogs and homologs of the present nucleotide sequences may be 
identified and isolated using well known methods. The cDNA libraries, orthologs, 
paralogs and homologs of the present nucleotide sequences may be screened using 
hybridization methods to determine their utility as hybridization targets or 
amplification probes. 

An example of stringent hybridization conditions for hybridization of 
complementary nucleic acids which have more than 100 complementary residues on a 
filter in a Southern or northern blot is about 5°C to 20°C lower than the thermal 
melting point (Tm) for the specific sequence at a defined ionic strength and pH. The 
Tm is the temperature (under defined ionic strength and pH) at which 50% of the 
target sequence hybridizes to a perfectly matched probe. Nucleic acid molecules that 
hybridize under stringent conditions will typically hybridize to a probe based on either 
the entire cDNA or selected portions, e.g., to a unique subsequence, of the cDNA 
under wash conditions of 0.2x SSC to 2.0 x SSC, 0.1% SDS at 50-65° C. For 
example, high stringency is about 0.2 x SSC, 0.1% SDS at 65° C. Ultra-high 
stringency will be the same conditions except the wash temperature is raised about 3 
to about 5° C, and ultra-ultra-high stringency will be the same conditions except the 
wash temperature is raised about 6 to about 9° C. For identification of less closely 
related homologues, washes can be performed at a lower temperature, e.g., 50° C. In 
general, stringency is increased by raising the wash temperature and/or decreasing the 
concentration of SSC, as known in the art. 
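
The temperature relationships in the preceding paragraph can be sketched as simple arithmetic. The function names are hypothetical, the Tm is assumed to be supplied by the user (no Tm estimation is attempted), and the 65° C base wash temperature is the high stringency example given above.

```python
from typing import Tuple


def stringent_hybridization_window(tm_celsius: float) -> Tuple[float, float]:
    """Return (low, high) temperatures about 20 to 5 degrees C below the Tm."""
    return (tm_celsius - 20.0, tm_celsius - 5.0)


# Offsets, in degrees C, added to the high stringency wash temperature.
WASH_OFFSETS = {
    "high": (0.0, 0.0),
    "ultra-high": (3.0, 5.0),
    "ultra-ultra-high": (6.0, 9.0),
}


def wash_temperature_range(level: str, base_celsius: float = 65.0) -> Tuple[float, float]:
    """Approximate wash temperature range for a named stringency level."""
    low, high = WASH_OFFSETS[level]
    return (base_celsius + low, base_celsius + high)


if __name__ == "__main__":
    print(stringent_hybridization_window(80.0))
    print(wash_temperature_range("ultra-high"))
```

All values are approximate, as the text qualifies each temperature with "about".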

In another example, stringent salt concentration will ordinarily be less than 
about 750 mM NaCl and 75 mM trisodium citrate, preferably less than about 500 mM 
NaCl and 50 mM trisodium citrate, and most preferably less than about 250 mM NaCl 
and 25 mM trisodium citrate. Low stringency hybridization can be obtained in the 
absence of organic solvent, e.g., formamide, while high stringency hybridization can 
be obtained in the presence of at least about 35% formamide, and most preferably at 
least about 50% formamide. Stringent temperature conditions will ordinarily include 
temperatures of at least about 30° C, more preferably of at least about 37° C, and most 
preferably of at least about 42° C. Varying additional parameters, such as 
hybridization time, the concentration of detergent, e.g., sodium dodecyl sulfate (SDS), 
and the inclusion or exclusion of carrier DNA, are well known to those skilled in the 
art. Various levels of stringency are accomplished by combining these various 
conditions as needed. In a preferred embodiment, hybridization will occur at 30° C in 
750 mM NaCl, 75 mM trisodium citrate, and 1% SDS. In a more preferred 
embodiment, hybridization will occur at 37° C in 500 mM NaCl, 50 mM trisodium 
citrate, 1% SDS, 35% formamide, and 100 µg/ml denatured salmon sperm DNA 
(ssDNA). In a most preferred embodiment, hybridization will occur at 42° C in 250 
mM NaCl, 25 mM trisodium citrate, 1% SDS, 50% formamide, and 200 µg/ml 
ssDNA. Useful variations on these conditions will be readily apparent to those skilled 
in the art. 

The washing steps that follow hybridization can also vary in stringency. Wash 
stringency conditions can be defined by salt concentration and by temperature. As 
above, wash stringency can be increased by decreasing salt concentration or by 
increasing temperature. For example, stringent salt concentration for the wash steps 
will preferably be less than about 30 mM NaCl and 3 mM trisodium citrate, and most 
preferably less than about 15 mM NaCl and 1.5 mM trisodium citrate. Stringent 
temperature conditions for the wash steps will ordinarily include a temperature of at 
least about 25° C, more preferably of at least about 42° C. Another preferred set of 
highly stringent conditions uses two final washes in 0.1X SSC, 0.1% SDS at 65° C. 
The most preferred high stringency washes are of at least about 68° C. For example, 
in a preferred embodiment, wash steps will occur at 25° C in 30 mM NaCl, 3 mM 
trisodium citrate, and 0.1% SDS. In a more preferred embodiment, wash steps will 
occur at 42° C in 15 mM NaCl, 1.5 mM trisodium citrate, and 0.1% SDS. In a most 
preferred embodiment, the wash steps will occur at 68° C in 15 mM NaCl, 1.5 mM 
trisodium citrate, and 0.1% SDS. Additional variations on these conditions will be 
readily apparent to those skilled in the art (see U.S. Patent Application No. 
20010010913). 
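
The salt concentrations above are stated both in mM and, elsewhere, as multiples of SSC. The following sketch converts between the two, assuming the conventional definition of 1X SSC as 150 mM NaCl and 15 mM trisodium citrate; that definition is standard laboratory practice and is not recited in this text. The function names are hypothetical.

```python
# Assumed composition of 1X SSC (conventional definition).
NACL_MM_PER_1X = 150.0
CITRATE_MM_PER_1X = 15.0


def ssc_to_millimolar(ssc_fold):
    """Return (NaCl mM, trisodium citrate mM) for a given SSC strength."""
    return (ssc_fold * NACL_MM_PER_1X, ssc_fold * CITRATE_MM_PER_1X)


def nacl_to_ssc_fold(nacl_mm):
    """Return the SSC strength whose NaCl concentration is nacl_mm."""
    return nacl_mm / NACL_MM_PER_1X


if __name__ == "__main__":
    # 0.2X and 0.1X SSC correspond to roughly 30/3 mM and 15/1.5 mM.
    print(ssc_to_millimolar(0.2))
    print(ssc_to_millimolar(0.1))
    # 750 mM NaCl with 75 mM citrate corresponds to 5X SSC.
    print(nacl_to_ssc_fold(750.0))
```

On this assumption, the 30 mM NaCl, 3 mM trisodium citrate wash corresponds to about 0.2X SSC, and the 15 mM NaCl, 1.5 mM trisodium citrate wash to about 0.1X SSC.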

As another example, stringent conditions can be selected such that an 
oligonucleotide that is perfectly complementary to the coding oligonucleotide 
hybridizes to the coding oligonucleotide with at least about a 5-10x higher signal to 
noise ratio than the ratio for hybridization of the perfectly complementary 
oligonucleotide to a nucleic acid encoding a transcription factor known as of the filing 
date of the application. Conditions can be selected such that a higher signal to noise 
ratio is observed in the particular assay which is used, e.g., about 15x, 25x, 35x, 50x 
or more. Accordingly, the subject nucleic acid hybridizes to the unique coding 
oligonucleotide with at least a 2x higher signal to noise ratio as compared to 
hybridization of the coding oligonucleotide to a nucleic acid encoding a known 
polypeptide. Again, higher signal to noise ratios can be selected, e.g., about 5x, 10x, 
25x, 35x, 50x or more. The particular signal will depend on the label used in the 
relevant assay, e.g., a fluorescent label, a colorimetric label, a radioactive label, or the 
like. 

Alternatively, transcription factor homolog polypeptides can be obtained by 
screening an expression library using antibodies specific for one or more transcription 
factors. With the provision herein of the disclosed transcription factor, and 
transcription factor homologue nucleic acid sequences, the encoded polypeptide(s) 
can be expressed and purified in a heterologous expression system (e.g., E. coli) and 
used to raise antibodies (monoclonal or polyclonal) specific for the polypeptide(s) in 
question. Antibodies can also be raised against synthetic peptides derived from 
transcription factor, or transcription factor homologue, amino acid sequences. 
Methods of raising antibodies are well known in the art and are described in Harlow 
and Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 
New York. Such antibodies can then be used to screen an expression library 
produced from the plant from which it is desired to clone additional transcription 
factor homologues, using the methods described above. The selected cDNAs can be 
confirmed by sequencing and enzymatic activity. 

VII. Sequence Variations 

It will readily be appreciated by those of skill in the art, that any of a variety of 
polynucleotide sequences are capable of encoding the transcription factors and 
transcription factor homologue polypeptides of the invention. Due to the degeneracy 
of the genetic code, many different polynucleotides can encode identical and/or 
substantially similar polypeptides in addition to those sequences illustrated in the 
Sequence Listing. Nucleic acids having a sequence that differs from the sequences 
shown in the Sequence Listing, or complementary sequences, that encode functionally 
equivalent peptides (i.e., peptides having some degree of equivalent or similar 
biological activity) but differ in sequence from the sequence shown in the Sequence 
Listing due to degeneracy in the genetic code, are also within the scope of the 
invention. 

Altered polynucleotide sequences encoding polypeptides include those 
sequences with deletions, insertions, or substitutions of different nucleotides, resulting 
in a polynucleotide encoding a polypeptide with at least one functional characteristic 
of the instant polypeptides. Included within this definition are polymorphisms which 
may or may not be readily detectable using a particular oligonucleotide probe of the 
polynucleotide encoding the instant polypeptides, and improper or unexpected 
hybridization to allelic variants, with a locus other than the normal chromosomal 
locus for the polynucleotide sequence encoding the instant polypeptides. 

Allelic variant refers to any of two or more alternative forms of a gene 
occupying the same chromosomal locus. Allelic variation arises naturally through 
mutation, and may result in phenotypic polymorphism within populations. Gene 
mutations can be silent (i.e., no change in the encoded polypeptide) or may encode 
polypeptides having altered amino acid sequence. The term allelic variant is also used 
herein to denote a protein encoded by an allelic variant of a gene. Splice variant refers 
to alternative forms of RNA transcribed from a gene. Splice variation arises naturally 
through use of alternative splicing sites within a transcribed RNA molecule, or less 
commonly between separately transcribed RNA molecules, and may result in several 
mRNAs transcribed from the same gene. Splice variants may encode polypeptides 
having altered amino acid sequence. The term splice variant is also used herein to 
denote a protein encoded by a splice variant of an mRNA transcribed from a gene. 

Those skilled in the art would recognize that the polypeptide sequence G681, 
SEQ ID NO: 580, represents a single transcription factor; allelic variation and 
alternative splicing may be expected to occur. Allelic variants of the polypeptide 
sequence of SEQ ID NO: 579 can be cloned by probing cDNA or genomic libraries 
from different individual organisms according to standard procedures. Allelic 
variants of the DNA sequence shown in SEQ ID NO: 579, including those containing 
silent mutations and those in which mutations result in amino acid sequence changes, 
are within the scope of the present invention, as are proteins which are allelic variants 
of SEQ ID NO: 580. cDNAs generated from alternatively spliced mRNAs, which 
retain the properties of the transcription factor, are included within the scope of the 
present invention, as are polypeptides encoded by such cDNAs and mRNAs. Allelic 
variants and splice variants of these sequences can be cloned by probing cDNA or 
genomic libraries from different individual organisms or tissues according to standard 
procedures known in the art (see USPN 6,388,064). 

For example, Table 1 illustrates that the codons AGC, AGT, TCA, TCC, 
TCG, and TCT all encode the same amino acid: serine. Accordingly, at each position 
in the sequence where there is a codon encoding serine, any of the above trinucleotide 
sequences can be used without altering the encoded polypeptide. 

Table 1 

Amino acid                     Possible Codons 

Alanine         Ala    A       GCA  GCC  GCG  GCT 
Cysteine        Cys    C       TGC  TGT 
Aspartic acid   Asp    D       GAC  GAT 
Glutamic acid   Glu    E       GAA  GAG 
Phenylalanine   Phe    F       TTC  TTT 
Glycine         Gly    G       GGA  GGC  GGG  GGT 
Histidine       His    H       CAC  CAT 
Isoleucine      Ile    I       ATA  ATC  ATT 
Lysine          Lys    K       AAA  AAG 
Leucine         Leu    L       TTA  TTG  CTA  CTC  CTG  CTT 
Methionine      Met    M       ATG 
Asparagine      Asn    N       AAC  AAT 
Proline         Pro    P       CCA  CCC  CCG  CCT 
Glutamine       Gln    Q       CAA  CAG 
Arginine        Arg    R       AGA  AGG  CGA  CGC  CGG  CGT 
Serine          Ser    S       AGC  AGT  TCA  TCC  TCG  TCT 
Threonine       Thr    T       ACA  ACC  ACG  ACT 
Valine          Val    V       GTA  GTC  GTG  GTT 
Tryptophan      Trp    W       TGG 
Tyrosine        Tyr    Y       TAC  TAT 

Sequence alterations that do not change the amino acid sequence encoded by 
the polynucleotide are termed "silent" variations. With the exception of the codons 
ATG and TGG, encoding methionine and tryptophan, respectively, any of the possible 
codons for the same amino acid can be substituted by a variety of techniques, e.g., 
site-directed mutagenesis, available in the art. Accordingly, any and all such 
variations of a sequence selected from the above table are a feature of the invention. 

In addition to silent variations, other conservative variations that alter one, or a 
few amino acids in the encoded polypeptide, can be made without altering the 
function of the polypeptide; these conservative variants are, likewise, a feature of the 
invention. 
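
The notion of a silent variation discussed above can be illustrated with a short sketch built on the standard genetic code as listed in Table 1. The dictionary layout and the function names translate and is_silent_variant are assumptions of this example; stop codons are deliberately omitted because Table 1 lists only amino acid codons.

```python
# One-letter amino acid code -> DNA codons (standard genetic code, Table 1).
CODONS = {
    "A": "GCA GCC GCG GCT", "C": "TGC TGT", "D": "GAC GAT", "E": "GAA GAG",
    "F": "TTC TTT", "G": "GGA GGC GGG GGT", "H": "CAC CAT",
    "I": "ATA ATC ATT", "K": "AAA AAG", "L": "TTA TTG CTA CTC CTG CTT",
    "M": "ATG", "N": "AAC AAT", "P": "CCA CCC CCG CCT", "Q": "CAA CAG",
    "R": "AGA AGG CGA CGC CGG CGT", "S": "AGC AGT TCA TCC TCG TCT",
    "T": "ACA ACC ACG ACT", "V": "GTA GTC GTG GTT", "W": "TGG",
    "Y": "TAC TAT",
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items()
               for codon in codons.split()}


def translate(dna):
    """Translate an in-frame coding sequence that contains no stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(reference, variant):
    """True if the sequences differ yet encode the same polypeptide."""
    return (reference.upper() != variant.upper()
            and translate(reference) == translate(variant))


if __name__ == "__main__":
    # AGC and TCA both encode serine, so this change is silent.
    print(is_silent_variant("ATGAGCTGG", "ATGTCATGG"))
    # AGC (serine) to ACC (threonine) alters the polypeptide.
    print(is_silent_variant("ATGAGCTGG", "ATGACCTGG"))
```

Methionine and tryptophan each have a single codon in the table, so no silent variation is possible at those positions.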

For example, substitutions, deletions and insertions introduced into the 
sequences provided in the Sequence Listing are also envisioned by the invention. 
Such sequence modifications can be engineered into a sequence by site-directed 
mutagenesis (Wu (ed.) Meth. Enzymol. (1993) vol. 217, Academic Press) or the other 
methods noted below. Amino acid substitutions are typically of single residues; 
insertions usually will be on the order of about from 1 to 10 amino acid residues; and 
deletions will range about from 1 to 30 residues. In preferred embodiments, deletions 
or insertions are made in adjacent pairs, e.g., a deletion of two residues or insertion of 
two residues. Substitutions, deletions, insertions or any combination thereof can be 
combined to arrive at a sequence. The mutations that are made in the polynucleotide 
encoding the transcription factor should not place the sequence out of reading frame 
and should not create complementary regions that could produce secondary mRNA 
structure. Preferably, the polypeptide encoded by the DNA performs the desired 
function. 

Conservative substitutions are those in which at least one residue in the amino 
acid sequence has been removed and a different residue inserted in its place. Such 
substitutions generally are made in accordance with Table 2 when it is desired to 
maintain the activity of the protein. Table 2 shows amino acids which can be 
substituted for an amino acid in a protein and which are typically regarded as 
conservative substitutions. 

Table 2 

Residue     Conservative Substitutions 

Ala         Ser 
Arg         Lys 
Asn         Gln; His 
Asp         Glu 
Gln         Asn 
Cys         Ser 
Glu         Asp 
Gly         Pro 
His         Asn; Gln 
Ile         Leu; Val 
Leu         Ile; Val 
Lys         Arg; Gln 
Met         Leu; Ile 
Phe         Met; Leu; Tyr 
Ser         Thr; Gly 
Thr         Ser; Val 
Trp         Tyr 
Tyr         Trp; Phe 
Val         Ile; Leu 
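
A lookup of the kind Table 2 supports may be sketched as follows. The dictionary transcribes the Table 2 entries as understood here, which is an assumption where the printed table is unclear; the function name is hypothetical, and the lookup is treated as directional, that is, only substitutions listed against the original residue are accepted.

```python
# Original residue -> residues regarded as conservative substitutions
# (assumed transcription of Table 2).
CONSERVATIVE = {
    "Ala": {"Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"}, "Asp": {"Glu"},
    "Gln": {"Asn"}, "Cys": {"Ser"}, "Glu": {"Asp"}, "Gly": {"Pro"},
    "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"}, "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"}, "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"}, "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"}, "Trp": {"Tyr"}, "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """True if replacing `original` with `replacement` is listed in the table."""
    return replacement in CONSERVATIVE.get(original, set())


if __name__ == "__main__":
    print(is_conservative("Ala", "Ser"))  # listed under Ala
    print(is_conservative("Ser", "Ala"))  # not listed under Ser
```

A residue that has no row in the table, such as Pro, simply yields no conservative substitutions under this sketch.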



Similar substitutions are those in which at least one residue in the amino acid 
sequence has been removed and a different residue inserted in its place. Such 
substitutions generally are made in accordance with Table 3 when it is desired to 
maintain the activity of the protein. Table 3 shows amino acids which can be 
substituted for an amino acid in a protein and which are typically regarded as 
structural and functional substitutions. For example, a residue in column 1 of Table 3 
may be substituted with a residue in column 2; in addition, a residue in column 2 of 
Table 3 may be substituted with the residue of column 1. 



Table 3 

Residue     Similar Substitutions 

Ala         Ser; Thr; Gly; Val; Leu; Ile 
Arg         Lys; His; Gly 
Asn         Gln; His; Gly; Ser; Thr 
Asp         Glu; Ser; Thr 
Gln         Asn; Ala 
Cys         Ser; Gly 
Glu         Asp 
Gly         Pro; Arg 
His         Asn; Gln; Tyr; Phe; Lys; Arg 
Ile         Ala; Leu; Val; Gly; Met 
Leu         Ala; Ile; Val; Gly; Met 
Lys         Arg; His; Gln; Gly; Pro 
Met         Leu; Ile; Phe 
Phe         Met; Leu; Tyr; Trp; His; Val; Ala 
Ser         Thr; Gly; Asp; Ala; Val; Ile; His 
Thr         Ser; Val; Ala; Gly 
Trp         Tyr; Phe; His 
Tyr         Trp; Phe; His 
Val         Ala; Ile; Leu; Gly; Thr; Ser; Glu 



Substitutions that are less conservative than those in Table 2 can be selected 
by picking residues that differ more significantly in their effect on maintaining (a) the 
structure of the polypeptide backbone in the area of the substitution, for example, as a 
sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the 
target site, or (c) the bulk of the side chain. The substitutions which in general are 
expected to produce the greatest changes in protein properties will be those in which 
(a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a 
hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a 
cysteine or proline is substituted for (or by) any other residue; (c) a residue having an 
electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an 
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side 
chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., 
glycine. 

VIII. Further Modifying Sequences of the Invention - Mutation/Forced 
Evolution 

In addition to generating silent or conservative substitutions as noted above, 
the present invention optionally includes methods of modifying the sequences of the 
Sequence Listing. In the methods, nucleic acid or protein modification methods are 
used to alter the given sequences to produce new sequences and/or to chemically or 
enzymatically modify given sequences to change the properties of the nucleic acids or 
proteins. 

Thus, in one embodiment, given nucleic acid sequences are modified, e.g., 
according to standard mutagenesis or artificial evolution methods to produce modified 
sequences. The modified sequences may be created using purified natural 
polynucleotides isolated from any organism or may be synthesized from purified 
compositions and chemicals using chemical means well known to those of skill in the 
art. For example, Ausubel, supra, provides additional details on mutagenesis 
methods. Artificial forced evolution methods are described, for example, by Stemmer 
(1994) Nature 370:389-391, Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747- 
10751, and U.S. Patents 5,811,238, 5,837,500, and 6,242,568. Methods for 
engineering synthetic transcription factors and other polypeptides are described, for 
example, by Zhang et al. (2000) J. Biol. Chem. 275:33850-33860, Liu et al. (2001) J. 
Biol. Chem. 276:11323-11334, and Isalan et al. (2001) Nature Biotechnol. 19:656- 
660. Many other mutation and evolution methods are also available and expected to 
be within the skill of the practitioner. 

Similarly, chemical or enzymatic alteration of expressed nucleic acids and 
polypeptides can be performed by standard methods. For example, a sequence can be 
modified by addition of lipids, sugars, peptides, organic or inorganic compounds, by 
the inclusion of modified nucleotides or amino acids, or the like. For example, 
protein modification techniques are illustrated in Ausubel, supra. Further details on 
chemical and enzymatic modifications can be found herein. These modification 
methods can be used to modify any given sequence, or to modify any sequence 
produced by the various mutation and artificial evolution modification methods noted 
herein. 

Accordingly, the invention provides for modification of any given nucleic acid 
by mutation, evolution, chemical or enzymatic modification, or other available 
methods, as well as for the products produced by practicing such methods, e.g., using 
the sequences herein as a starting substrate for the various modification approaches. 

For example, optimized coding sequence containing codons preferred by a 
particular prokaryotic or eukaryotic host can be used, e.g., to increase the rate of
translation or to produce recombinant RNA transcripts having desirable properties, 
such as a longer half-life, as compared with transcripts produced using a non- 
optimized sequence. Translation stop codons can also be modified to reflect host 
preference. For example, preferred stop codons for Saccharomyces cerevisiae and 
mammals are TAA and TGA, respectively. The preferred stop codon for 
monocotyledonous plants is TGA, whereas insects and E. coli prefer to use TAA as 
the stop codon. 
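
The stop codon preferences listed above can be illustrated with a minimal Python sketch. The host labels, the dictionary name and the function name are hypothetical and chosen only for illustration; the codon assignments are those stated in the preceding paragraph, and the sketch assumes the coding sequence is supplied in frame with its terminal stop codon.

```python
# Preferred stop codons as stated in the text above. The host labels are
# illustrative keys, not identifiers taken from the specification.
PREFERRED_STOP = {
    "S. cerevisiae": "TAA",
    "mammal": "TGA",
    "monocot": "TGA",
    "insect": "TAA",
    "E. coli": "TAA",
}

STOP_CODONS = {"TAA", "TAG", "TGA"}


def set_preferred_stop(cds: str, host: str) -> str:
    """Return the coding sequence with its terminal stop codon replaced
    by the stop codon preferred by the given host."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence does not end in a stop codon")
    return cds[:-3] + PREFERRED_STOP[host]
```

For instance, calling set_preferred_stop("ATGGCCTAG", "monocot") replaces the terminal TAG with TGA and leaves the rest of the coding sequence unchanged.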

The polynucleotide sequences of the present invention can also be engineered 
in order to alter a coding sequence for a variety of reasons, including but not limited 
to, alterations which modify the sequence to facilitate cloning, processing and/or 
expression of the gene product. For example, alterations are optionally introduced 
using techniques which are well known in the art, e.g., site-directed mutagenesis, to 
insert new restriction sites, to alter glycosylation patterns, to change codon preference, 
to introduce splice sites, etc. 

Furthermore, a fragment or domain derived from any of the polypeptides of
the invention can be combined with domains derived from other transcription factors 
or synthetic domains to modify the biological activity of a transcription factor. For 
instance, a DNA-binding domain derived from a transcription factor of the invention 
can be combined with the activation domain of another transcription factor or with a 
synthetic activation domain. A transcription activation domain assists in initiating 
transcription from a DNA-binding site. Examples include the transcription activation 
region of VP16 or GAL4 (Moore et al. (1998) Proc. Natl. Acad. Sci. USA 95: 376- 
381; and Aoyama et al. (1995) Plant Cell 7:1773-1785), peptides derived from 
bacterial sequences (Ma and Ptashne (1987) Cell 51:113-119) and synthetic peptides
(Giniger and Ptashne, (1987) Nature 330:670-672). 

IX. Expression and Modification of Polypeptides 

Typically, polynucleotide sequences of the invention are incorporated into 
recombinant DNA (or RNA) molecules that direct expression of polypeptides of the 
invention in appropriate host cells, transgenic plants, in vitro translation systems, or 
the like. Due to the inherent degeneracy of the genetic code, nucleic acid sequences 
which encode substantially the same or a functionally equivalent amino acid sequence 
can be substituted for any listed sequence to provide for cloning and expressing the 
relevant homologue. 
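
The degeneracy referred to above can be shown with a short Python sketch that translates two different nucleotide sequences and confirms that they encode the same amino acid sequence. The sketch assumes the standard genetic code; the names CODON_TABLE and translate are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order so that
# they line up with the conventional 64-letter amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def encode_same_polypeptide(cds_a: str, cds_b: str) -> bool:
    """True if two coding sequences specify the same amino acid sequence."""
    return translate(cds_a) == translate(cds_b)
```

Under these assumptions, ATGGCTTAA and ATGGCCTAA differ at the third position of the second codon, yet both translate to the same product because GCT and GCC both encode alanine.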

X. Vectors, Promoters, and Expression Systems 

The present invention includes recombinant constructs comprising one or 
more of the nucleic acid sequences herein. The constructs typically comprise a 
vector, such as a plasmid, a cosmid, a phage, a virus (e.g., a plant virus), a bacterial 
artificial chromosome (BAC), a yeast artificial chromosome (YAC), or the like, into
which a nucleic acid sequence of the invention has been inserted, in a forward or 
reverse orientation. In a preferred aspect of this embodiment, the construct further 
comprises regulatory sequences, including, for example, a promoter, operably linked 
to the sequence. Large numbers of suitable vectors and promoters are known to those 
of skill in the art, and are commercially available. 

General texts that describe molecular biological techniques useful herein, 
including the use and production of vectors, promoters and many other relevant 
topics, include Berger, Sambrook and Ausubel, supra. Any of the identified sequences
can be incorporated into a cassette or vector, e.g., for expression in plants. A number of 
expression vectors suitable for stable transformation of plant cells or for the 
establishment of transgenic plants have been described including those described in 
Weissbach and Weissbach (1989) Methods for Plant Molecular Biology, Academic
Press, and Gelvin et al. (1990) Plant Molecular Biology Manual, Kluwer Academic
Publishers. Specific examples include those derived from a Ti plasmid of 
Agrobacterium tumefaciens, as well as those disclosed by Herrera-Estrella et al. 
(1983) Nature 303: 209, Bevan (1984) Nucl Acid Res. 12: 8711-8721, Klee (1985)
Bio/Technology 3: 637-642, for dicotyledonous plants. 

Alternatively, non-Ti vectors can be used to transfer the DNA into 
monocotyledonous plants and cells by using free DNA delivery techniques. Such 
methods can involve, for example, the use of liposomes, electroporation, 
microprojectile bombardment, silicon carbide whiskers, and viruses. By using these 
methods transgenic plants such as wheat, rice (Christou (1991) Bio/Technology 9:
957-962) and corn (Gordon-Kamm (1990) Plant Cell 2: 603-618) can be produced. 
An immature embryo can also be a good target tissue for monocots for direct DNA 
delivery techniques by using the particle gun (Weeks et al. (1993) Plant Physiol 102: 
1077-1084; Vasil (1993) Bio/Technology 10: 667-674; Wan and Lemeaux (1994) 
Plant Physiol 104: 37-48), and for Agrobacterium-mediated DNA transfer (Ishida et al.
(1996) Nature Biotech 14: 745-750). 

Typically, plant transformation vectors include one or more cloned plant 
coding sequences (genomic or cDNA) under the transcriptional control of 5' and 3'
regulatory sequences and a dominant selectable marker. Such plant transformation 
vectors typically also contain a promoter (e.g., a regulatory region controlling 
inducible or constitutive, environmentally- or developmentally-regulated, or cell- or
tissue-specific expression), a transcription initiation start site, an RNA processing 
signal (such as intron splice sites), a transcription termination site, and/or a 
polyadenylation signal. 

Examples of constitutive plant promoters which can be useful for expressing 
the TF sequence include: the cauliflower mosaic virus (CaMV) 35S promoter, which 
confers constitutive, high-level expression in most plant tissues (see, e.g., Odell et al.
(1985) Nature 313:810-812); the nopaline synthase promoter (An et al. (1988) Plant 
Physiol 88:547-552); and the octopine synthase promoter (Fromm et al. (1989) Plant 
Cell 1:977-984). 

A variety of plant gene promoters that regulate gene expression in response to 
environmental, hormonal, chemical, or developmental signals, or in a tissue-active
manner can be used for expression of a TF sequence in plants. Choice of a promoter 
is based largely on the phenotype of interest and is determined by such factors as 
tissue (e.g., seed, fruit, root, pollen, vascular tissue, flower, carpel, etc.), inducibility 
(e.g., in response to wounding, heat, cold, drought, light, pathogens, etc.), timing, 
developmental stage, and the like. Numerous known promoters have been 
characterized and can favorably be employed to promote expression of a 
polynucleotide of the invention in a transgenic plant or cell of interest. For example, 
tissue specific promoters include: seed-specific promoters (such as the napin, 
phaseolin or DC3 promoter described in US Pat. No. 5,773,697), fruit-specific 
promoters that are active during fruit ripening (such as the dru1 promoter (US Pat.
No. 5,783,393), or the 2A11 promoter (US Pat. No. 4,943,674) and the tomato
polygalacturonase promoter (Bird et al. (1988) Plant Mol Biol 11:651)), root-specific
promoters, such as those disclosed in US Patent Nos. 5,618,988, 5,837,848 and 
5,905,186, pollen-active promoters such as PTA29, PTA26 and PTA13 (US Pat. No. 
5,792,929), promoters active in vascular tissue (Ringli and Keller (1998) Plant Mol 
Biol 37:977-988), flower-specific (Kaiser et al, (1995) Plant Mol Biol 28:231-243), 
pollen (Baerson et al. (1994) Plant Mol Biol 26:1947-1959), carpels (Ohl et al. (1990) 
Plant Cell 2:837-848), pollen and ovules (Baerson et al. (1993) Plant Mol Biol 
22:255-267), auxin-inducible promoters (such as that described in van der Kop et al. 
(1999) Plant Mol Biol 39:979-990 or Baumann et al. (1999) Plant Cell 11:323-334),
cytokinin-inducible promoter (Guevara-Garcia (1998) Plant Mol Biol 38:743-753), 
promoters responsive to gibberellin (Shi et al. (1998) Plant Mol Biol 38:1053-1060, 
Willmott et al. (1998) Plant Mol Biol 38:817-825) and the like. Additional promoters are those that
elicit expression in response to heat (Ainley et al. (1993) Plant Mol Biol 22: 13-23), 
light (e.g., the pea rbcS-3A promoter, Kuhlemeier et al. (1989) Plant Cell 1:471, and 
the maize rbcS promoter, Schaffner and Sheen (1991) Plant Cell 3: 997); wounding
(e.g., wun1, Siebertz et al. (1989) Plant Cell 1: 961); pathogens (such as the PR-1
promoter described in Buchel et al. (1999) Plant Mol. Biol. 40:387-396, and the
PDF1.2 promoter described in Manners et al. (1998) Plant Mol. Biol. 38:1071-80), 
and chemicals such as methyl jasmonate or salicylic acid (Gatz et al. (1997) Plant Mol 
Biol 48: 89-108). In addition, the timing of the expression can be controlled by using 
promoters such as those acting at senescence (Gan and Amasino (1995) Science 270:
1986-1988); or late seed development (Odell et al. (1994) Plant Physiol 106:447-458). 

Plant expression vectors can also include RNA processing signals that can be 
positioned within, upstream or downstream of the coding sequence. In addition, the 
expression vectors can include additional regulatory sequences from the 3'-
untranslated region of plant genes, e.g., a 3' terminator region to increase
stability of the mRNA, such as the PI-II terminator region of potato or the octopine or
nopaline synthase 3' terminator regions. 

Additional Expression Elements 

Specific initiation signals can aid in efficient translation of coding sequences.
These signals can include, e.g., the ATG initiation codon and adjacent sequences. In
cases where a coding sequence, its initiation codon and upstream sequences are
inserted into the appropriate expression vector, no additional translational control
signals may be needed. However, in cases where only coding sequence (e.g., a
mature protein coding sequence), or a portion thereof, is inserted, exogenous
translational control signals including the ATG initiation codon can be separately
provided. The initiation codon is provided in the correct reading frame to facilitate
translation. Exogenous translational elements and initiation codons can be of
various origins, both natural and synthetic. The efficiency of expression can be
enhanced by the inclusion of enhancers appropriate to the cell system in use.
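
The reading-frame requirement described above can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and the sketch assumes that the inserted fragment is already in frame (its length is a multiple of three) and that an initiation codon is simply prepended when the fragment lacks one.

```python
def add_initiation_codon(coding_fragment: str) -> str:
    """Return the fragment with an in-frame ATG at its 5' end.

    A fragment that already begins with ATG is returned unchanged; otherwise
    ATG is prepended, which preserves the downstream reading frame because
    exactly three nucleotides are added.
    """
    seq = coding_fragment.upper()
    if len(seq) % 3 != 0:
        raise ValueError("fragment is not a whole number of codons")
    if seq.startswith("ATG"):
        return seq
    return "ATG" + seq
```

A fragment such as GCTGGATAA, which encodes only a mature portion of a protein, would be returned as ATGGCTGGATAA, with every original codon remaining in frame.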

Expression Hosts 

The present invention also relates to host cells which are transduced with 
vectors of the invention, and the production of polypeptides of the invention 
(including fragments thereof) by recombinant techniques. Host cells are genetically 
engineered (i.e., nucleic acids are introduced, e.g., transduced, transformed or 
transfected) with the vectors of this invention, which may be, for example, a cloning 
vector or an expression vector comprising the relevant nucleic acids herein. The 
vector is optionally a plasmid, a viral particle, a phage, a naked nucleic acid, etc. The
engineered host cells can be cultured in conventional nutrient media modified as 
appropriate for activating promoters, selecting transformants, or amplifying the 
relevant gene. The culture conditions, such as temperature, pH and the like, are those 
previously used with the host cell selected for expression, and will be apparent to 
those skilled in the art and in the references cited herein, including Sambrook and
Ausubel. 

The host cell can be a eukaryotic cell, such as a yeast cell, or a plant cell, or 
the host cell can be a prokaryotic cell, such as a bacterial cell. Plant protoplasts are 
also suitable for some applications. For example, the DNA fragments are introduced 
into plant tissues, cultured plant cells or plant protoplasts by standard methods 
including electroporation (Fromm et al. (1985) Proc. Natl. Acad. Sci. USA 82:5824),
infection by viral vectors such as cauliflower mosaic virus (CaMV) (Hohn et al.
(1982) Molecular Biology of Plant Tumors, (Academic Press, New York) pp. 549- 
560; US 4,407,956), high velocity ballistic penetration by small particles with the 
nucleic acid either within the matrix of small beads or particles, or on the surface 
(Klein et al., (1987) Nature 327, 70-73), use of pollen as vector (WO 85/01856), or 
use of Agrobacterium tumefaciens or A. rhizogenes carrying a T-DNA plasmid in
which DNA fragments are cloned. The T-DNA plasmid is transmitted to plant cells 
upon infection by Agrobacterium tumefaciens, and a portion is stably integrated into 
the plant genome (Horsch et al. (1984) Science 233:496-498; Fraley et al. (1983) 
Proc. Natl. Acad. Sci. USA 80, 4803). 

The cell can include a nucleic acid of the invention which encodes a 
polypeptide, wherein the cell expresses a polypeptide of the invention. The cell can
also include vector sequences, or the like. Furthermore, cells and transgenic plants 
that include any polypeptide or nucleic acid above or throughout this specification, 
e.g., produced by transduction of a vector of the invention, are an additional feature of 
the invention. 

For long-term, high-yield production of recombinant proteins, stable 
expression can be used. Host cells transformed with a nucleotide sequence encoding 
a polypeptide of the invention are optionally cultured under conditions suitable for the 
expression and recovery of the encoded protein from cell culture. The protein or
fragment thereof produced by a recombinant cell may be secreted, membrane-bound, 
or contained intracellularly, depending on the sequence and/or the vector used. As
will be understood by those of skill in the art, expression vectors containing 
polynucleotides encoding mature proteins of the invention can be designed with 
signal sequences which direct secretion of the mature polypeptides through a 
prokaryotic or eukaryotic cell membrane. 

XI. Modified Amino Acid Residues 

Polypeptides of the invention may contain one or more modified amino acid 
residues. The presence of modified amino acids may be advantageous in, for 
example, increasing polypeptide half-life, reducing polypeptide antigenicity or 
toxicity, increasing polypeptide storage stability, or the like. Amino acid residue(s) 
are modified, for example, co-translationally or post-translationally during 
recombinant production or modified by synthetic or chemical means. 

Non-limiting examples of a modified amino acid residue include incorporation 
or other use of acetylated amino acids, glycosylated amino acids, sulfated amino 
acids, prenylated (e.g., farnesylated, geranylgeranylated) amino acids, PEG modified 
(e.g., "PEGylated") amino acids, biotinylated amino acids, carboxylated amino acids, 
phosphorylated amino acids, etc. References adequate to guide one of skill in the 
modification of amino acid residues are replete throughout the literature. 

The modified amino acid residues may prevent or increase affinity of the 
polypeptide for another molecule, including, but not limited to, polynucleotides,
proteins, carbohydrates, lipids and lipid derivatives, and other organic or synthetic 
compounds. 

XII. Identification of Additional Factors 

A transcription factor provided by the present invention can also be used to 
identify additional endogenous or exogenous molecules that can affect a phenotype or
trait of interest. On the one hand, such molecules include organic (small or large 
molecules) and/or inorganic compounds that affect expression of (i.e., regulate) a 
particular transcription factor. Alternatively, such molecules include endogenous 
molecules that are acted upon at a transcriptional level by a transcription factor
of the invention to modify a phenotype as desired. For example, the transcription
factors can be employed to identify one or more downstream genes that are
subject to a regulatory effect of the transcription factor. In one approach, a
transcription factor or transcription factor homologue of the invention is expressed in 
a host cell, e.g., a transgenic plant cell, tissue or explant, and expression products, 
either RNA or protein, of likely or random targets are monitored, e.g., by 
hybridization to a microarray of nucleic acid probes corresponding to genes expressed 
in a tissue or cell type of interest, by two-dimensional gel electrophoresis of protein 
products, or by any other method known in the art for assessing expression of gene
products at the level of RNA or protein. Alternatively, a transcription factor of the 
invention can be used to identify promoter sequences (i.e., binding sites) involved in 
the regulation of a downstream target. After identifying a promoter sequence, 
interactions between the transcription factor and the promoter sequence can be 
modified by changing specific nucleotides in the promoter sequence or specific amino 
acids in the transcription factor that interact with the promoter sequence to alter a 
plant trait. Typically, transcription factor DNA-binding sites are identified by gel 
shift assays. After identifying the promoter regions, the promoter region sequences 
can be employed in double-stranded DNA arrays to identify molecules that affect the 
interactions of the transcription factors with their promoters (Bulyk et al. (1999) 
Nature Biotechnology 17:573-577). 

The identified transcription factors are also useful to identify proteins that 
modify the activity of the transcription factor. Such modification can occur by 
covalent modification, such as by phosphorylation, or by protein-protein (homo- or
heteropolymer) interactions. Any method suitable for detecting protein-protein
interactions can be employed. Among the methods that can be employed are co- 
immunoprecipitation, cross-linking and co-purification through gradients or 
chromatographic columns, and the two-hybrid yeast system. 

The two-hybrid system detects protein interactions in vivo and is described in 
Chien et al. ((1991), Proc. Natl. Acad. Sci. USA 88:9578-9582) and is commercially 
available from Clontech (Palo Alto, Calif.). In such a system, plasmids are 
constructed that encode two hybrid proteins: one consists of the DNA-binding domain 
of a transcription activator protein fused to the TF polypeptide and the other consists
of the transcription activator protein's activation domain fused to an unknown protein 
that is encoded by a cDNA that has been recombined into the plasmid as part of a 
cDNA library. The DNA-binding domain fusion plasmid and the cDNA library are 
transformed into a strain of the yeast Saccharomyces cerevisiae that contains a 
reporter gene (e.g., lacZ) whose regulatory region contains the transcription activator's 
binding site. Either hybrid protein alone cannot activate transcription of the reporter 
gene. Interaction of the two hybrid proteins reconstitutes the functional activator 
protein and results in expression of the reporter gene, which is detected by an assay 
for the reporter gene product. Then, the library plasmids responsible for reporter gene 
expression are isolated and sequenced to identify the proteins encoded by the library 
plasmids. After identifying proteins that interact with the transcription factors, assays 
for compounds that interfere with the TF protein-protein interactions can be 
performed.

XIII. Identification of Modulators 

In addition to the intracellular molecules described above, extracellular 
molecules that alter activity or expression of a transcription factor, either directly or 
indirectly, can be identified. For example, the methods can entail first placing a 
candidate molecule in contact with a plant or plant cell. The molecule can be 
introduced by topical administration, such as spraying or soaking of a plant, and then 
the molecule's effect on the expression or activity of the TF polypeptide or the 
expression of the polynucleotide monitored. Changes in the expression of the TF 
polypeptide can be monitored by use of polyclonal or monoclonal antibodies, gel 
electrophoresis or the like. Changes in the expression of the corresponding 
polynucleotide sequence can be detected by use of microarrays, Northerns, 
quantitative PCR, or any other technique for monitoring changes in mRNA 
expression. These techniques are exemplified in Ausubel et al. (eds) Current 
Protocols in Molecular Biology, John Wiley & Sons (1998, and supplements through 
2001). Such changes in the expression levels can be correlated with modified plant 
traits and thus identified molecules can be useful for soaking or spraying on fruit, 
vegetable and grain crops to modify traits in plants. 

Essentially any available composition can be tested for modulatory activity of
expression or activity of any nucleic acid or polypeptide herein. Thus, available 
libraries of compounds such as chemicals, polypeptides, nucleic acids and the like can 
be tested for modulatory activity. Often, potential modulator compounds can be 
dissolved in aqueous or organic (e.g., DMSO-based) solutions for easy delivery to the 
cell or plant of interest in which the activity of the modulator is to be tested. 
Optionally, the assays are designed to screen large modulator composition libraries by 
automating the assay steps and providing compounds from any convenient source to 
assays, which are typically run in parallel (e.g., in microtiter formats on microtiter 
plates in robotic assays). 

In one embodiment, high throughput screening methods involve providing a
combinatorial library containing a large number of potential compounds (potential 
modulator compounds). Such "combinatorial chemical libraries" are then screened in 
one or more assays, as described herein, to identify those library members (particular 
chemical species or subclasses) that display a desired characteristic activity. The 
compounds thus identified can serve as target compounds. 

A combinatorial chemical library can be, e.g., a collection of diverse chemical 
compounds generated by chemical synthesis or biological synthesis. For example, a 
combinatorial chemical library such as a polypeptide library is formed by combining a 
set of chemical building blocks (e.g., in one example, amino acids) in every possible 
way for a given compound length (i.e., the number of amino acids in a polypeptide 
compound of a set length). Exemplary libraries include peptide libraries, nucleic acid 
libraries, antibody libraries (see, e.g., Vaughn et al. (1996) Nature Biotechnology, 
14(3):309-314 and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al. 
Science (1996) 274:1520-1522 and U.S. Patent 5,593,853), peptide nucleic acid 
libraries (see, e.g., U.S. Patent 5,539,083), and small organic molecule libraries (see, 
e.g., benzodiazepines, Baum C&EN Jan 18, page 33 (1993); isoprenoids, U.S. Patent 
5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, 
U.S. Patents 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent 
5,506,337) and the like. 
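
The construction of a polypeptide library by combining building blocks "in every possible way for a given compound length", as described above, can be sketched in Python. The function name and the default set of twenty standard amino acids in one-letter code are assumptions made for illustration; an actual library may use any set of building blocks.

```python
from itertools import product

# The twenty standard amino acids in one-letter code (illustrative default).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def peptide_library(length: int, building_blocks: str = AMINO_ACIDS):
    """Yield every peptide of the given length that can be assembled from
    the supplied building blocks, i.e. len(building_blocks) ** length members."""
    if length < 1:
        raise ValueError("length must be at least 1")
    for combination in product(building_blocks, repeat=length):
        yield "".join(combination)
```

With twenty building blocks the library grows as 20 raised to the peptide length (400 dipeptides, 8,000 tripeptides), which is why the sketch yields members one at a time rather than building the complete list in memory.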

Preparation and screening of combinatorial or other libraries is well known to
those of skill in the art. Such combinatorial chemical libraries include, but are not 
limited to, peptide libraries (see, e.g., U.S. Patent 5,010,175; Furka, (1991) Int. J. 
Pept. Prot. Res. 37:487-493; and Houghton et al. (1991) Nature 354:84-88). Other
chemistries for generating chemical diversity libraries can also be used. 

In addition, as noted, compound screening equipment for high-throughput 
screening is generally available, e.g., using any of a number of well known robotic 
systems that have also been developed for solution phase chemistries useful in assay 
systems. These systems include automated workstations including an automated 
synthesis apparatus and robotic systems utilizing robotic arms. Any of the above 
devices are suitable for use with the present invention, e.g., for high-throughput 
screening of potential modulators. The nature and implementation of modifications to 
these devices (if any) so that they can operate as discussed herein will be apparent to 
persons skilled in the relevant art. 

Indeed, entire high throughput screening systems are commercially available. 
These systems typically automate entire procedures including all sample and reagent 
pipetting, liquid dispensing, timed incubations, and final readings of the microplate in 
detector(s) appropriate for the assay. These configurable systems provide high 
throughput and rapid start up as well as a high degree of flexibility and customization. 
Similarly, microfluidic implementations of screening are also commercially available. 

The manufacturers of such systems provide detailed protocols for the various
high throughput systems. Thus, for example, Zymark Corp. provides technical bulletins describing
screening systems for detecting the modulation of gene transcription, ligand binding, 
and the like. The integrated systems herein, in addition to providing for sequence 
alignment and, optionally, synthesis of relevant nucleic acids, can include such 
screening apparatus to identify modulators that have an effect on one or more 
polynucleotides or polypeptides according to the present invention. 

In some assays it is desirable to have positive controls to ensure that the 
components of the assays are working properly. At least two types of positive 
controls are appropriate. That is, known transcriptional activators or inhibitors can be 
incubated with cells, plants, etc. in one sample of the assay, and the resulting
increase/decrease in transcription can be detected by measuring the resulting change
in RNA/protein expression, etc., according to the methods herein. It will be
appreciated that modulators can also be combined with transcriptional activators or 
inhibitors to find modulators that inhibit transcriptional activation or transcriptional 
repression. Either expression of the nucleic acids and proteins herein or any 
additional nucleic acids or proteins activated by the nucleic acids or proteins herein, 
or both, can be monitored.

In an embodiment, the invention provides a method for identifying 
compositions that modulate the activity or expression of a polynucleotide or 
polypeptide of the invention. For example, a test compound, whether a small or large 
molecule, is placed in contact with a cell, plant (or plant tissue or explant), or 
composition comprising the polynucleotide or polypeptide of interest and a resulting 
effect on the cell, plant, (or tissue or explant) or composition is evaluated by 
monitoring, either directly or indirectly, one or more of: expression level of the 
polynucleotide or polypeptide, activity (or modulation of the activity) of the 
polynucleotide or polypeptide. In some cases, an alteration in a plant phenotype can 
be detected following contact of a plant (or plant cell, or tissue or explant) with the 
putative modulator, e.g., by modulation of expression or activity of a polynucleotide 
or polypeptide of the invention. Modulation of expression or activity of a 
polynucleotide or polypeptide of the invention may also be caused by molecular 
elements in a signal transduction second messenger pathway and such modulation can 
affect similar elements in the same or another signal transduction second messenger 
pathway. 

XIV. Subsequences 

Also contemplated are uses of polynucleotides, also referred to herein as 
oligonucleotides, typically having at least 12 bases, preferably at least 15, more 
preferably at least 20, 30, or 50 bases, which hybridize under at least highly stringent
(or ultra-high stringent or ultra-ultra-high stringent) conditions to a
polynucleotide sequence described above. The polynucleotides may be used as 
probes, primers, sense and antisense agents, and the like, according to methods as 
noted supra. 

Subsequences of the polynucleotides of the invention, including
polynucleotide fragments and oligonucleotides are useful as nucleic acid probes and 
primers. An oligonucleotide suitable for use as a probe or primer is at least about 15 
nucleotides in length, more often at least about 18 nucleotides, often at least about 21 
nucleotides, frequently at least about 30 nucleotides, or about 40 nucleotides, or more 
in length. A nucleic acid probe is useful in hybridization protocols, e.g., to identify 
additional polypeptide homologues of the invention, including protocols for 
microarray experiments. Primers can be annealed to a complementary target DNA 
strand by nucleic acid hybridization to form a hybrid between the primer and the 
target DNA strand, and then extended along the target DNA strand by a DNA 
polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid 
sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid 
amplification methods. See Sambrook and Ausubel, supra. 
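
The derivation of candidate probe or primer subsequences of a minimum length can be illustrated with the following Python sketch. The function names are hypothetical; the 15-nucleotide minimum is taken from the paragraph above, and the sketch does not attempt to assess hybridization stringency, melting temperature or specificity.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")
MIN_OLIGO_LENGTH = 15  # minimum probe or primer length stated in the text


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def candidate_oligos(target: str, length: int = MIN_OLIGO_LENGTH) -> List[str]:
    """Return every contiguous subsequence of the target of the given length."""
    if length < MIN_OLIGO_LENGTH:
        raise ValueError("oligonucleotide shorter than the stated minimum")
    target = target.upper()
    return [target[i:i + length] for i in range(len(target) - length + 1)]
```

An oligonucleotide that anneals to a given window of the target strand is the reverse complement of that window, so a reverse primer for the 3' end of a target could be obtained, under these assumptions, as reverse_complement(candidate_oligos(target)[-1]).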

In addition, the invention includes an isolated or recombinant polypeptide 
including a subsequence of at least about 15 contiguous amino acids encoded by the 
recombinant or isolated polynucleotides of the invention. For example, such 
polypeptides, or domains or fragments thereof, can be used as immunogens, e.g., to 
produce antibodies specific for the polypeptide sequence, or as probes for detecting a 
sequence of interest. A subsequence can range in size from about 15 amino acids in 
length up to and including the full length of the polypeptide. 

To be encompassed by the present invention, an expressed polypeptide which 
comprises such a polypeptide subsequence performs at least one biological function 
of the intact polypeptide in substantially the same manner, or to a similar extent, as
does the intact polypeptide. For example, a polypeptide fragment can comprise a 
recognizable structural motif or functional domain such as a DNA binding domain 
that binds to a specific DNA promoter region, an activation domain or a domain for 
protein-protein interactions. 

XV. Production of Transgenic Plants

Modification of Traits 

The polynucleotides of the invention are favorably employed to produce 
transgenic plants with various traits, or characteristics, that have been modified in a 
desirable manner, e.g., to improve the seed characteristics of a plant. For example, 
alteration of expression levels or patterns (e.g., spatial or temporal expression 
patterns) of one or more of the transcription factors (or transcription factor 
homologues) of the invention, as compared with the levels of the same protein found 
in a wild type plant, can be used to modify a plant's traits. An illustrative example of
trait modification (improved characteristics obtained by altering the expression level of a
particular transcription factor) is described further in the Examples and the Sequence
Listing.

Arabidopsis as a model system 

Arabidopsis thaliana is the object of rapidly growing attention as a model for 
genetics and metabolism in plants. Arabidopsis has a small genome, and well 
documented studies are available. It is easy to grow in large numbers and mutants 
defining important genetically controlled mechanisms are either available, or can 
readily be obtained. Various methods to introduce and express isolated homologous
genes are available (see Koncz et al., eds., Methods in Arabidopsis Research
(1992), World Scientific, New Jersey, in "Preface"). Because of its small
size, short life cycle, obligate autogamy and high fertility, Arabidopsis is also a
choice organism for the isolation of mutants and studies in morphogenetic and
developmental pathways, and control of these pathways by transcription factors (Koncz,
supra, p. 72). A number of studies introducing transcription factors into A. thaliana
have demonstrated the utility of this plant for understanding the mechanisms of gene
regulation and trait alteration in plants (see, for example, Koncz, supra, and U.S.
Patent Number 6,417,428).

Arabidopsis genes in transgenic plants. 

Expression of genes that encode transcription factors that modify expression of
endogenous genes, polynucleotides, and proteins is well known in the art. In
addition, transgenic plants comprising isolated polynucleotides encoding transcription
factors may also modify expression of endogenous genes, polynucleotides, and
proteins. Examples include Peng et al. (1997, Genes and Development 11:3194-3205)
and Peng et al. (1999, Nature 400:256-261). In addition, many others have
demonstrated that an Arabidopsis transcription factor expressed in an exogenous plant
species elicits the same or very similar phenotypic response. See, for example, Fu et
al. (2001, Plant Cell 13:1791-1802); Nandi et al. (2000, Curr. Biol. 10:215-218);
Coupland (1995, Nature 377:482-483); and Weigel and Nilsson (1995, Nature
377:495-500).

Homologous genes introduced into transgenic plants. 

Homologous genes that may be derived from any plant, or from any source 
whether natural, synthetic, semi-synthetic or recombinant, and that share significant 
sequence identity or similarity to those provided by the present invention, may be 
introduced into plants, for example, crop plants, to confer desirable or improved traits. 
Consequently, transgenic plants may be produced that comprise a recombinant 
expression vector or cassette with a promoter operably linked to one or more 
sequences homologous to presently disclosed sequences. The promoter may be, for
example, a plant or viral promoter. 

The invention thus provides for methods for preparing transgenic plants, and 
for modifying plant traits. These methods include introducing into a plant a 
recombinant expression vector or cassette comprising a functional promoter operably 
linked to one or more sequences homologous to presently disclosed sequences. Plants
that result from the application of these methods, and kits for producing these plants,
are also encompassed by the present invention.

The complete descriptions of the traits associated with each polynucleotide of
the invention are fully disclosed in Table 4, Table 5, and Table 6.



[Rotated table pages: contents not legible in the source scan.]



GID | Trait(s) | Category | Family | Conserved domains | Phenotype
----|----------|----------|--------|-------------------|----------
G1421 | Fertility; size; seed oil content | Dev and morph; seed biochemistry | AP2 | (74-151) | Reduced fertility; small plant; altered seed oil content
G1453 | Fertility; morphology: other | Dev and morph | NAC | (13-160) | Reduced fertility; altered inflorescence development
G1560 | Fertility; flower; size | Dev and morph | HS | (62-151) | Reduced fertility; altered flower development; reduced size
G1594 | Fertility; leaf; seed | Dev and morph | HB | (343-308) | Reduced fertility; altered leaf shape and development; large pale seed
G1750 | Fertility; size; seed oil content | Dev and morph; seed biochemistry | AP2 | (107-173) | Reduced fertility; reduced size; increased seed oil content
G1947 | Fertility; flower; seed protein content | Dev and morph; seed biochemistry | HS | (37-120) | Reduced fertility; extended period of flowering; altered seed protein content
G2011 | Fertility; size; seed oil and protein content | Dev and morph; seed biochemistry | HS | (56-147) | Reduced fertility; reduced size; altered seed oil and protein content
G2094 | Fertility; leaf; size | Dev and morph | GATA/Zn | (43-68) | Reduced fertility; altered leaf development; reduced size
G2113 | Fertility; leaf; seed protein content | Dev and morph; seed biochemistry | AP2 | [illegible] | Reduced fertility; long petioles, altered orientation; altered seed protein content
G2115 | Fertility; size | Dev and morph | AP2 | (46-115) | Reduced fertility; reduced size
G2130 | Fertility; size; senescence | Dev and morph | AP2 | (93-160) | Reduced fertility; reduced size; early senescence
G2147 | Fertility; size | Dev and morph | HLH/MYC | (160-234) | Reduced fertility; reduced size
G2156 | Fertility; size; seed protein content | Dev and morph; seed biochemistry | AT-hook | (66-86) | Reduced fertility; reduced size; altered seed protein content
G2294 | Fertility; size | Dev and morph | AP2 | [illegible] | Reduced fertility; reduced size
G2510 | Fertility; size | Dev and morph | AP2 | [illegible] | Reduced fertility; reduced size
G2893 | Fertility; flower; size | Dev and morph | MYB-(R1)R2R3 | (19-120) | Reduced fertility; altered flower development; reduced size
G340 | Fertility; size | Dev and morph | Z-C3H | (37-154) | Reduced fertility, size
G39 | Fertility; size | Dev and morph | AP2 | (24-90) | Reduced fertility, small plant
G439 | Fertility; size | Dev and morph | AP2 | [illegible] | Reduced fertility; small plant
G470 | Fertility | Dev and morph | ARF | (61-393) | Short stamen filaments



[Table continued on a further rotated page; gene identifiers and sequence numbers are not legible in the source. Legible entries, in printed order:]

Conserved domains: (109-177); (90-210); (198-247); (33-42, 78-175); (886-896); (70-127); [illegible]; [illegible]; (261-311); (28-350); [illegible]; (TBD); [illegible]; (124-149); (17-59); (205-263, 344-404); (184-254); (100-153); (46-106)

Phenotypes:
Abnormal anther development; small and spindly plant; altered seed fatty acids
Altered inflorescence structure; altered leaf development
Altered leaf shape
Serrated leaves; increased plant size; flowering appears to be slightly delayed
Altered leaf development

GID | Sequence identifier | Probability | Species | Annotation
----|---------------------|-------------|---------|-----------
G681 | BG448527 | 1.00E-33 | Medicago truncatula | NF036F04RT1F1032 Developing root Medica
G681 | gi13346188 | 9.10E-37 | Gossypium hirsutum | GHMYB25.
G681 | [illegible] | 6.30E-36 | Petunia x hybrida | protein 1.
G681 | gi485867 | 1.20E-34 | Antirrhinum majus | mixta.
G681 | gi2605617 | 1.70E-32 | Oryza sativa | OSMYB1.
G681 | gi1430846 | 2.00E-31 | Lycopersicon esculentum | myb-related transcription factor.
G681 | gi6651292 | 2.20E-30 | Pimpinella brachycarpa | myb-related transcription factor.
G681 | gi15042116 | 4.90E-30 | Zea mays subsp. parviglumis | C1 protein.
G681 | gi82730 | 6.10E-30 | Zea mays | transforming protein (myb) homolog (clone Zm38)
G681 | gi5139806 | 8.30E-30 | Glycine max | [illegible]
G681 | gi19055 | 1.10E-29 | Hordeum vulgare | MybHv5.
G878 | AF096299 | 6.20E-90 | Nicotiana tabacum | DNA-binding protein 2 (WRKY2) mRNA, compl
G878 | CUSSLDB | 1.80E-83 | Cucumis sativus | SPF1-like DNA-binding protein mRNA, complet
G878 | AF193802 | 3.50E-63 | Oryza sativa | zinc finger transcription factor WRKY1 mRNA, c
G878 | AX192162 | 2.20E-62 | Glycine max | Sequence 9 from Patent WO0149840.
G878 | IPBSPF1P | 3.80E-58 | Ipomoea batatas | Sweet potato mRNA for SPF1 protein, complet
G878 | AFABF1 | 2.00E-56 | Avena fatua | A.fatua mRNA for DNA-binding protein (clone ABF
G878 | LES303343 | 7.20E-55 | Lycopersicon esculentum | mRNA for hypothetical protein (ORF
G878 | AX192164 | 4.00E-54 | Triticum aestivum | Sequence 11 from Patent WO0149840.
G878 | AF080595 | 2.10E-53 | Pimpinella brachycarpa | zinc finger protein (ZFP1) mRNA, com
G878 | PCU48831 | 2.30E-53 | Petroselinum crispum | DNA-binding protein WRKY1 mRNA, comple
G878 | gi4322940 | 3.30E-128 | Nicotiana tabacum | DNA-binding protein 2.
G878 | gi927025 | 1.10E-109 | Cucumis sativus | SPF1-like DNA-binding protein.
G878 | gi6689916 | 1.50E-74 | Oryza sativa | zinc finger transcription factor WRKY1.
G878 | gi484261 | 1.10E-66 | Ipomoea batatas | SPF1 protein.
G878 | gi1159877 | 2.30E-63 | Avena fatua | DNA-binding protein.
G878 | gi13620227 | 4.60E-63 | Lycopersicon esculentum | hypothetical protein.
G878 | gi5917653 | 1.70E-56 | Petroselinum crispum | zinc-finger type transcription facto
G878 | gi4894965 | 5.00E-56 | Avena sativa | DNA-binding protein WRKY1.
G878 | gi3420906 | 8.70E-56 | Pimpinella brachycarpa | zinc finger protein; WRKY1.
G878 | gi13620168 | 4.20E-22 | Capsella rubella | hypothetical protein.
G374 | AP004457 | [illegible] | Oryza sativa (japonica cultivar-group) | ( ) chromosome 8 clo
G374 | AP004693 | 1.90E-73 | Oryza sativa | chromosome 8 clone P0461F06, *** SEQUENCING IN
G374 | BH552835 | 1.30E-62 | Brassica oleracea | BOHGT56TR BOHG Brassica oleracea genomic
G374 | BG128229 | 6.50E-55 | Lycopersicon esculentum | EST473875 tomato shoot/meristem Lyc
G374 | BG646959 | 3.20E-46 | Medicago truncatula | EST508578 HOGA Medicago truncatula cDNA

Traits of interest

Examples of some of the traits that may be desirable in plants, and that may be
provided by transforming the plants with the presently disclosed sequences, are listed
in Table 6.



Table 6. Genes, traits and utilities that affect plant characteristics 



Trait Category 


Traits 


Transcription factor genes that 
impact traits 


Utility 
Gene effect on: 




Resistance and 
tolerance 


Salt stress resistance 


G22; G196; G226; G303;
G312; G325; G353; G482;
[four gene identifiers illegible in source];
G922; G926; G1452; G1794;
G1820; G1836; G1843; G1863;
G2053; G2110; G2140; G2153;
G2789


Germination rate,
survivability,
yield, extended
growth range




Osmotic stress 
resistance 


G47; G175; G188; G303; 
G325; G353; G489; G502; 
G526; G921; G922; G926; 
G1069; G1089; G1452; G1794; 
G1930; G2140; G2153; G2379; 
G2701; G2719; G2789


Germination rate,
survivability, yield




Cold stress resistance; 
cold germination 


G256; G394; G664; G864;
G1322; G2130


Germination, 
growth, earlier 
planting 




Tolerance to freezing 


G303; G325; G353; G720; 
G912; G913; G1794; G2053; 
G2140; G2153; G2379; G2701; 
G2719; G2789 


Survivability, 
yield, appearance, 
extended range 




Heat stress resistance 


G3; G464; G682; G864; G964;
G1305; G1645; G2130; G2430


Germination,
growth, later
planting



Drought, low 
humidity resistance 


G303; G325; G353; G720; 
G912; G926; G1452; G1794; 
G1820; G1843; G2053; G2140; 
G2153; G2379; G2583; G2701; 
G2719; G2789 


Survivability, 
yield, extended 
range 




Radiation resistance 


G1052 


Survivability, 
vigor, appearance 




Decreased herbicide 
sensitivity 


G343; G2133; G2517 


Resistant to 
increased 
herbicide use 




Increased herbicide 
sensitivity 


G374; G877; G1519


Use as a herbicide 
target 




Oxidative stress


G477; G789; G1807; G2133;
G2517


Improved yield, 
appearance, 
reduced 
senescence 




Light response 


G183; G354; G375; G1062;
G1322; G1331; G1488; G1494;
G1521; G1786; G1794; G2144;
G2555


Germination, 
growth, 
development, 
flowering time 




Development, 
morphology 


Overall plant 
architecture 


G24; G27; G31; G33; G47;
G147; G156; G160; G182;
G187; G195; G196; G211;
G221; G237; G280; G342;
G352; G357; G358; G360;
G362; G364; G365; G367;
G373; G377; G396; G431;
G447; G479; G546; G546;
G551; G578; G580; G596;
G615; G617; G620; G625;
G638; G658; G716; G725;
G727; G730; G740; G770;
G858; G865; G869; G872;
G904; G910; G912; G920;
G939; G963; G977; G979;
G987; G988; G993; G1007;
G1010; G1014; [illegible]; G1049;
[illegible]; G1062; [illegible]; G1069;
[illegible]; [illegible]; G1076; G1089;
G1093; G1127; G1131; G1145;
[illegible]; G1229; [illegible]; G1304;
G1318; [illegible]; [illegible]; [illegible];
G1331; G1352; G1354; [illegible];
G1364; [illegible]; G1379; [illegible];
G1415; G1417; [illegible]; G1442;
[illegible]; G1453; G1454; G1459;
[illegible]; [illegible]; [illegible]; G1475;
G1477; [illegible]; G1487; [illegible];
G1487; G1492; G1499; G1499;
[illegible]; G1531; G1540; G1543;
G1543; [illegible]; G1344; [illegible];
G1548; G1584; [illegible]; [illegible];
G1589; [illegible]; G1636; G1642;
[illegible]; [illegible]; G1749; [illegible];
G1749; G1751; [illegible]; G1763;
G1766; G1767; [illegible]; [illegible];
G1789; G1790; G1791; [illegible];
G1793; G1794; G1795; G1800;
G1806; G1811; G1835; G1836;
G1838; G1839; G1843; G1853;
G1855; G1865; G1881; G1882;
G1883; G1884; G1891; G1896;
G1898; G1902; G1904; G1906;
G1913; G1914; G1925; G1929;
G1930; G1954; G1958; G1965;
G1976; G2057; G2107; G2133;
G2134; G2151; G2154; G2157;
G2181;
G2290; G2299; G2340; G2340;
G2346; G2373; G2376; G2424;
G2465; G2505; G2509; G2512;
G2513; G2519; G2520; G2533;
G2534; G2573; G2589; G2687;
G2720; G2787; G2789; G2893


Vascular tissues,
lignin content; cell
wall content;
appearance






Size: increased stature 


G189; G1073; G1435; G2430 






Size: reduced stature 
or dwarfism 


G3; G5; G21; G23; G39; G165; 
G184; G194; G258; G280; 
G340; G343; G353; G354; 
G362; G363; G370; G385; 
G396; G439; G440; G447; 
G450; G550; G557; G599; 
G636; G652; G670; G671; 
G674; G729; G760; G804; 
G831; G864; G884; G898;
G900; G912; G913; G922;
G932; G937; G939; G960;
G962; G977; G991; G1000;
G1008; G1020; G1023; G1053; 
G1067; G1075; G1137; G1181; 
G1198; G1228; G1266; G1267; 
G1275; G1277; G1309; G1311; 
G1314; G1317; G1322; G1323; 
G1326; G1332; G1334; G1367; 
G1381; G1382; G1386; G1421; 
G1488; G1494; G1537; G1545; 
G1560; G1586; G1641; G1652; 
G1655; G1671; G1750; G1756; 
G1757; G1782; G1786; G1794; 
G1839; G1845; G1879; G1886; 
G1888; G1933; G1939; G1943; 
G1944; G2011; G2094; G2115;
G2130; G2132; G2144; G2145;
G2147; G2156; G2294; G2313;
G2344; G2431; G2510; G2517;
G2521; G2893; G2893


Ornamental; small
stature provides
wind resistance;
creation of dwarf
varieties






Fruit size and number 


G362 


Biomass, yield, 
cotton boll fiber 
density




Flower structure, 
inflorescence 


G47; G259; G353; G354; 
G671; G732; G988; G1000;
G1063; G1140; G1326; G1449; 
G1543; G1560; G1587; G1645; 
G1947; G2108; G2143; G2893 


Ornamental 
horticulture; 
production of 
saffron or other 
edible flowers 




Number and
development of 
trichomes 


G225; G226; G247; G362; 
G585; G634; G676; G682; 
G1014; G1332; G1452; G1795; 
G2105 


Resistance to pests 
and desiccation; 
essential oil 
production 




Seed size, color, and 
number 


G156; G450; G584; G652; 
G668; G858; G979; G1040;
G1062; G1145; G1255; G1494;
G1531; G1534; G1594; G2105;
G2114


Yield 




Root development, 
modifications 


G9; G1482; G1534; G1794; 
G1852; G2053; G2136; G2140 






Modifications to root 
hairs 


G225; G226 


Nutrient, water 
uptake, pathogen 
resistance 




Apical dominance 


G559; G732; G1255; G1275; 
G1411; G1488; G1635; G2452;
G2509 


Ornamental 
horticulture 




Branching patterns 


G568; G988; G1548


Ornamental 
horticulture, knot 
reduction, 
improved 
windscreen




Leaf shape, color, 
modifications 


G375; G377; G428; G438; 
G447; G464; G557; G577; 
G599; G635; G671; G674; 
G736; G804; G903; G977; 
G921; G922; G1038; G1063; 
G1067; G1073; G1075; G1146; 
G1152; G1198; G1267; G1269; 
G1452; G1484; G1586; G1594; 
G1767; G1786; G1792; G1886; 
G2059; G2094; G2105; G2113; 
G2117; G2143; G2144; G2431; 
G2452; G2465; G2587; G2583; 
G2724; 


Appealing shape 
or shiny leaves for 
ornamental
agriculture, 
increased biomass 
or photosynthesis 




Silique 


G1134 


Ornamental 




Stem morphology 


G47; G438; G671; G748; 
G988; G1000 


Ornamental; 
digestibility 




Shoot modifications 


G390; G391 


Ornamental stem 
bifurcations 




Disease, 

Pathogen 

Resistance 


Bacterial 


G211; G347; G367; G418;
G525; G545; G578; G1049 


Yield, appearance, 
survivability, 
extended range 




Fungal 


G19; G28; G28; G28; G147;
G188; G207; G211; G237;
G248; G278; G347; G367;
G371; G378; G409; G477;
G545; G545; G558; G569;
G578; G591; G594; G616;
G789; G805; G812; G865;
G869; G872; G881; G896;
G940; G1047; G1049; G1064;
G1084; G1196; G1255; G1266;
G1363; G1514; G1756; G1792;
G1792; G1792; G1792; G1880;
G1919; G1919; G1927; G1927;
G1936; G1936; G1950; G2069;
G2130; G2380; G2380; G2555


Yield, appearance,
survivability,
extended range






Nutrients 


Increased tolerance to 
nitrogen-limited soils 


G225; G226; G1792






Increased tolerance to 

phosphate-limited 

soils 


G419; G545; G561; G1946






Increased tolerance to 

potassium-limited 

soils 


G561; G911






Hormonal 


Hormone sensitivity 


G12; G546; G926; G760; 
G913; G926; G1062; G1069; 
G1095; G1134; G1330; G1452;
G1666; G1820; G2140; G2789 


Seed dormancy, 
drought tolerance; 
plant form, fruit 
ripening 




Seed 

biochemistry 


Production of seed 
prenyl lipids, 
including tocopherol 


G214; G259; G490; G652; 
G748; G883; G1052; G1328;
G1930; G2509; G2520 


Antioxidant 
activity, vitamin E 




Production of seed 
sterols 


G20 


Precursors for 
human steroid 
hormones; 
cholesterol 
modulators 




Production of seed 
glucosinolates 


G353; G484; G674; G1272; 
G1506; G1897; G1946; G2113; 
G2117; G2155; G2290; G2340 


Defense against 
insects; putative 
anticancer 
activity; 
undesirable in 
animal feeds




Modified seed oil 
content 


G162; G162; G180; G192; 
G241; G265; G286; G291;
G427; G509; G519; G561;
G567; G590; G818; G849;
G892; G961; G974; G1063;
G1143; G1190; G1198; G1226;
G1229; G1323; G1451; G1471; 
G1478; G1496; G1526; G1543; 
G1640; G1644; G1646; G1672; 
G1677; G1750; G1765; G1777; 
G1793; G1838; G1902; G1946; 
G1948; G2059; G2123; G2138; 
G2139; G2343; G2792; G2830 


Vegetable oil 
production; 
increased caloric 
value for animal 
feeds; lutein 
content 




Modified seed oil 
composition 


G217; G504; G622; G778; 
G791; G861; G869; G938;
G965; G1417; G2192


Heat stability, 
digestibility of 
seed, oils 




Modified seed protein
content


G162; G226; G241; G371;
G427; G509; G567; G597; 
G732; G849; G865; G892; 
G963; G988; G1323; G1323; 
G1419; G1478; G1488; G1634; 
G1637; G1641; G1644; G1652; 
G1677; G1777; G1777; G1818; 
G1820; G1903; G1909; G1946; 
G1946; G1958; G2059; G2117; 
G2417; G2509 


Reduced caloric 
value for humans 










Leaf 

biochemistry 


Production of 
flavonoids 


G1666* 


Ornamental 
pigment 
production; 
pathogen 
resistance; health 
benefits




Production of leaf 
glucosinolates 


G264; G353; G484; G652; 
G674; G681; G1069; G1198;
G1322; G1421; G1657; G1794;
G1897; G1946; G2115; G2117;
G2144; G2155; G2155; G2340;
G2512; G2520; G2552


Defense against 
insects; putative 
anticancer 
activity; 
undesirable in 
animal feeds 




Production of 
diterpenes 


G229 


Induction of 
enzymes involved 
in alkaloid 
biosynthesis 




Production of 
anthocyanin 


G546 


Ornamental 
pigment 




Production of leaf 
phytosterols, inc. 
stigmastanol, 
campesterol 


G561; G2131; G2424


Precursors for 
human steroid 
hormones; 
cholesterol 
modulators 




Leaf fatty acid 
composition 


G214; G377; G861; G962;
G975; G987; G1266; G1337; 
G1399; G1465; G1512; G2136; 
G2147; G2192 


Nutritional value; 
increase in waxes 
for disease 
resistance 




Production of leaf 
prenyl lipids, 
including tocopherol 


G214; G259; G280; G652; 
G987; G1543; G2509; G2520 


Antioxidant 
activity, vitamin E 




Biochemistry, 
general 


Production of 
miscellaneous 
secondary metabolites 


G229; G663 






Sugar, starch, 
hemicellulose 
composition, 


G158; G211; G211; G237;
G242; G274; G598; G1012;
G1266; G1309; G1309; G1641;
G1765; G1865; G2094; G2094;
G2589; G2589


Food digestibility,
hemicellulose &
pectin content;
fiber content; plant
tensile strength,
wood quality,
pathogen
resistance, pulp
production; tuber
starch content




Sugar sensing 


Plant response to 
sugars 


G26; G38; G43; G207; G218;
G241; G254; G263; G308;
G536; G567; G567; G680; 
G867; G912; G956; G996; 
G1068; G1225; G1314; G1314; 
G1337; G1759; G1804; G2153; 
G2379 


Photosynthetic 

rate, carbohydrate 

accumulation, 

biomass 

production, 

source-sink 

relationships, 

senescence 




Growth, 
Reproduction 


Plant growth rate and 
development 


G447; G617; G674; G730; 
G917; G937; G1035; G1046;
G1131; G1425; G1452; G1459; 
G1492; G1589; G1652; G1879; 
G1943; G2430; G2431; G2465; 
G2521 


Faster growth, 
increased biomass 
or yield, improved 
appearance; delay 
in bolting 




Embryo development 


G167 






Seed germination rate 


G979; G1792; G2130


Yield 




Plant, seedling vigor 


G561; G2346 


Survivability, 
yield 




Senescence; cell death 


G571; G636; G878; G1050;
G1463; G1749; G1944; G2130;
G2155; G2340; G2383


Yield, appearance;
response to
pathogens




Modified fertility 


G39; G340; G439; G470;
G559; G615; G652; G671;
G779; G962; G977; G988;
G1000; G1063; G1067; G1075;
G1266; G1311; G1321; G1326;
G1367; G1386; G1421; G1453;
G1471; G1453; G1560; G1594;
G1635; G1750; G1947; G2011;
G2094; G2113; G2115; G2130;
G2143; G2147; G2294; G2510;
G2893


Prevents or
minimizes escape
of the pollen of
GMOs






Early flowering 


G147; G157; G180; G183; 
G183; G184; G185; G208; 
G227; G294; G390; G390; 
G390;G391;G391;G427; 
G427; G490; G565; G590; 
G592; G720; G789; G865; 
G898; G898; G989; G989; 
G1037; G1037; G1142; G1225; 
G1225; G1226; G1242; G1305; 
G1305; G1380; G1380; G1480; 
G1480; G1488; G1494; G1545; 
G1545; G1649; G1706; G1760; 
G1767; G1767; G1820; G1841; 
G1841; G1842; G1843; G1843; 
G1946; G1946; G2010; G2030; 
G2030; G2144; G2144; G2295; 
G2295; G2347; G2348; G2348; 
G2373; G2373; G2509; G2509; 
G2555; G2555 


Faster generation 
time; synchrony of 
flowering; 
potential for 
introducing new 
traits to single 
variety 




Delayed flowering 


G8; G47; G192; G214; G234; 
G361; G362; G562; G568; 
G571; G591; G680; G736; 
G748; G859; G878; G910; 
G912; G913; G971; G994; 
G1051; G1052; G1073; G1079; 
G1335; G1435; G1452; G1478; 


Delayed time to 
pollen production 
of GMO plants; 
synchrony of 
flowering; 
increased yield
G1789; G1804; G1865; G1865;
G1895; G1900; G2007; G2133; 
G2155;G2291;G2465 






Extended flowering 
phase 


G1947 






Flower and leaf 
development 


G259; G353; G377; G580; 
G638; G652; G858; G869;
G917;G922;G932; G1063; 
G1075; G1140; G1425; G1452; 
G1499; G1548; G1645; G1865; 
G1897; G1933; G2094; G2124; 
G2140; G2143; G2535; G2557 


Ornamental 
applications; 
decreased fertility 




Flower abscission 


G1897 


Ornamental: 
longer retention of 
flowers 



* When co-expressed with G669 and G663 



Significance of modified plant traits 

Currently, the existence of a series of maturity groups for different latitudes 
represents a major barrier to the introduction of new valuable traits. Any trait (e.g. 
disease resistance) has to be bred into each of the different maturity groups separately,
a laborious and costly exercise. The availability of a single strain, which could be
grown at any latitude, would therefore greatly increase the potential for introducing 
new traits to crop species such as soybean and cotton. 

For many of the traits, listed in Table 6 and below, that may be conferred to 
plants, a single transcription factor gene may be used to increase or decrease, advance 
or delay, or improve or prove deleterious to a given trait. For example, 
overexpression of a transcription factor gene that naturally occurs in a plant may 
cause early flowering relative to non-transformed or wild-type plants. By knocking 
out the gene, or suppressing the gene (with, for example, antisense suppression), the
plant may experience delayed flowering. Similarly, overexpressing or suppressing
one or more genes can impart significant differences in production of plant products,
such as different fatty acid ratios. Thus, suppressing a gene that causes a plant to be
more sensitive to cold may improve a plant's tolerance of cold. 

Salt stress resistance. Soil salinity is one of the more important variables that
determine where a plant may thrive. Salinity is especially important for the
successful cultivation of crop plants, particularly in many parts of the world that have
naturally high soil salt concentrations, or where the soil has been over-utilized. Thus, 
presently disclosed transcription factor genes that provide increased salt tolerance 
during germination, the seedling stage, and throughout a plant's life cycle would find 
particular value for imparting survivability and yield in areas where a particular crop 
would not normally prosper. 

Osmotic stress resistance. Presently disclosed transcription factor genes that 
confer resistance to osmotic stress may increase germination rate under adverse 
conditions, which could impact survivability and yield of seeds and plants. 

Cold stress resistance. The potential utility of presently disclosed transcription 
factor genes that increase tolerance to cold is to confer better germination and growth 
in cold conditions. The germination of many crops is very sensitive to cold 
temperatures. Genes that would allow germination and seedling vigor in the cold 
would have highly significant utility in allowing seeds to be planted earlier in the 
season with a high rate of survivability. Transcription factor genes that confer better 
survivability in cooler climates allow a grower to move up planting time in the spring 
and extend the growing season further into autumn for higher crop yields. 

Tolerance to freezing. The presently disclosed transcription factor genes that
impart tolerance to freezing conditions are useful for enhancing the survivability and
appearance of plants under conditions that would otherwise cause extensive
cellular damage. Thus, germination of seeds and survival may take place at
temperatures significantly below the mean temperature required for
germination of seeds and survival of non-transformed plants. As with salt tolerance,
this has the added benefit of increasing the potential range of a crop plant into regions
in which it would otherwise succumb. Cold tolerant transformed plants may also be
planted earlier in the spring or later in autumn, with greater success than
non-transformed plants.



Heat stress tolerance. The germination of many crops is also sensitive to high
temperatures. Presently disclosed transcription factor genes that provide increased
heat tolerance are generally useful in producing plants that germinate and grow in hot
conditions, and may find particular use for crops that are planted late in the season or
for extending the range of a plant by allowing growth in relatively hot climates.

Drought, low humidity tolerance. Strategies that allow plants to survive in
low water conditions may include, for example, reduced surface area or surface oil or 
wax production. A number of presently disclosed transcription factor genes increase 
a plant's tolerance to low water conditions and provide the benefits of improved 
survivability, increased yield and an extended geographic and temporal planting 
range. 

Radiation resistance. Presently disclosed transcription factor genes have been
shown to increase lutein production. Lutein, like other xanthophylls such as
zeaxanthin and violaxanthin, is important in the protection of plants against the
damaging effects of excessive light. Lutein contributes, directly or indirectly, to the 
rapid rise of non-photochemical quenching in plants exposed to high light. Increased 
tolerance of field plants to visible and ultraviolet light impacts survivability and vigor, 
particularly for recent transplants. Also affected are the yield and appearance of 
harvested plants or plant parts. Crop plants engineered with presently disclosed 
transcription factor genes that cause the plant to produce higher levels of lutein 
therefore would have improved photoprotection, leading to less oxidative damage and
increased vigor, survivability and yield under high light and ultraviolet light
conditions. 

Decreased herbicide sensitivity. Presently disclosed transcription factor genes 
that confer resistance or tolerance to herbicides (e.g., glyphosate) may find use in 
providing means to increase herbicide applications without detriment to desirable 
plants. This would allow for the increased use of a particular herbicide in a local
environment, with the effect of increased detriment to undesirable species and less
harm to transgenic, desirable cultivars. 

Increased herbicide sensitivity. Knockouts of a number of the presently
disclosed transcription factor genes have been shown to be lethal to developing 
embryos. Thus, these genes are potentially useful as herbicide targets. 

Oxidative stress. In plants, as in all living things, abiotic and biotic stresses
induce the formation of oxygen radicals, including superoxide and peroxide radicals. 
This has the effect of accelerating senescence, particularly in leaves, with the resulting 
loss of yield and adverse effect on appearance. Generally, plants that have the highest 
level of defense mechanisms, such as, for example, polyunsaturated moieties of 
membrane lipids, are most likely to thrive under conditions that introduce oxidative 
stress (e.g., high light, ozone, water deficit, particularly in combination). Introduction 
of the presently disclosed transcription factor genes that increase the level of oxidative 
stress defense mechanisms would provide beneficial effects on the yield and 
appearance of plants. One specific oxidizing agent, ozone, has been shown to cause 
significant foliar injury, which impacts yield and appearance of crop and ornamental 
plants. In addition to the reduced foliar injury that would be found in ozone-resistant
plants created by transforming plants with some of the presently disclosed transcription
factor genes, such plants have also been shown to have increased chlorophyll
fluorescence (Yu-Sen Chang et al. Bot. Bull. Acad. Sin. (2001) 42: 265-272). 

Heavy metal tolerance. Heavy metals such as lead, mercury, arsenic,
chromium and others may have a significant adverse impact on plant respiration.
Plants that have been transformed with presently disclosed transcription factor genes
that confer improved resistance to heavy metals, through, for example, sequestering or
reduced uptake of the metals, will show improved vigor and yield in soils with
relatively high concentrations of these elements. Conversely, transgenic transcription 
factors may also be introduced into plants to confer an increase in heavy metal uptake, 
which may benefit efforts to clean up contaminated soils. 

Light response. Presently disclosed transcription factor genes that modify a
plant's response to light may be useful for modifying a plant's growth or
development, for example, photomorphogenesis in poor light, or accelerating
flowering time in response to various light intensities, quality or duration to which a
non-transformed plant would not similarly respond. Examples of such responses that
have been demonstrated include leaf number and arrangement, and early flower bud
appearance.

Overall plant architecture. Several presently disclosed transcription factor
genes have been introduced into plants to alter numerous aspects of the plant's 
morphology. For example, it has been demonstrated that a number of transcription 
factors may be used to manipulate branching, such as the means to modify lateral 
branching, a possible application in the forestry industry. Transgenic plants have also 
been produced that have altered cell wall content, lignin production, flower organ 
number, or overall shape of the plants. Presently disclosed transcription factor genes 
transformed into plants may be used to affect plant morphology by increasing or 
decreasing internode distance, both of which may be advantageous under different 
circumstances. For example, for fast growth of woody plants to provide more 
biomass, or fewer knots, increased internode distances are generally desirable. For 
improved wind screening of shrubs or trees, or harvesting characteristics of, for 
example, members of the Gramineae family, decreased internode distance may be 
advantageous. These modifications would also prove useful in the ornamental 
horticulture industry for the creation of unique phenotypic characteristics of 
ornamental plants. 

Increased stature. For some ornamental plants, the ability to provide larger
varieties may be highly desirable. For many plants, including fruit-bearing trees or
trees and shrubs that serve as view or wind screens, increased stature provides 
obvious benefits. Crop species may also produce higher yields on larger cultivars. 

Reduced stature or dwarfism. Presently disclosed transcription factor genes
that decrease plant stature can be used to produce plants that are more resistant to 
damage by wind and rain, or more resistant to heat or low humidity or water deficit. 
Dwarf plants are also of significant interest to the ornamental horticulture industry, 
and particularly for home garden applications for which space availability may be 
limited.

Fruit size and number. Introduction of presently disclosed transcription factor
genes that affect fruit size will have desirable impacts on fruit size and number, which 
may comprise increases in yield for fruit crops, or reduced fruit yield, such as when 
vegetative growth is preferred (e.g., with bushy ornamentals, or where fruit is 
undesirable, as with ornamental olive trees). 

Flower structure, inflorescence, and development. Presently disclosed
transgenic transcription factors have been used to create plants with larger flowers or 
arrangements of flowers that are distinct from wild-type or non-transformed cultivars. 
This would likely have the most value for the ornamental horticulture industry, where 
larger flowers or interesting presentations generally are preferred and command the 
highest prices. Flower structure may have advantageous effects on fertility, and could
be used, for example, to decrease fertility by the absence, reduction or screening of
reproductive components. One interesting application for manipulation of flower
structure, for example, by introduced transcription factors, could be in the increased
production of edible flowers or flower parts, including saffron, which is derived from 
the stigmas of Crocus sativus. 

Number and development of trichomes. Several presently disclosed
transcription factor genes have been used to modify trichome number and amount of 
trichome products in plants. Trichome glands on the surface of many higher plants 
produce and secrete exudates that give protection from the elements and pests such as 
insects, microbes and herbivores. These exudates may physically immobilize insects 
and spores, may be insecticidal or antimicrobial, or they may act as allergens or
irritants to protect against herbivores. Trichomes have also been suggested to decrease 
transpiration by decreasing leaf surface air flow, and by exuding chemicals that 
protect the leaf from the sun. 

Seed size, color and number. The introduction of presently disclosed
transcription factor genes into plants that alter the size or number of seeds may have a 
significant impact on yield, both when the product is the seed itself, or when biomass 
of the vegetative portion of the plant is increased by reducing seed production. In the 
case of fruit products, it is often advantageous to modify a plant to have reduced size
or number of seeds relative to non-transformed plants to provide seedless varieties or
varieties with fewer or smaller seeds. Presently disclosed transcription factor genes
have also been shown to affect seed size, including the development of larger seeds.
Seed size, in addition to seed coat integrity, thickness and permeability, seed water
content and a number of other components including antioxidants and
oligosaccharides, may affect seed longevity in storage. This would be an important
utility when the seed of a plant is the harvested crop, as with, for example, peas,
beans, nuts, etc. Presently disclosed transcription factor genes have also been used to 
modify seed color, which could provide added appeal to a seed product. 

Root development, modifications. By modifying the structure or development
of roots by transforming into a plant one or more of the presently disclosed 
transcription factor genes, plants may be produced that have the capacity to thrive in 
otherwise unproductive soils. For example, grape roots that extend further into rocky 
soils, or that remain viable in waterlogged soils, would increase the effective planting 
range of the crop. It may be advantageous to manipulate a plant to produce short 
roots, as when a soil in which the plant will be growing is occasionally flooded, or 
when pathogenic fungi or disease-causing nematodes are prevalent. 

Modifications to root hairs. Presently disclosed transcription factor genes that
increase root hair length or number potentially could be used to increase root growth 
or vigor, which might in turn allow better plant growth under adverse conditions such 
as limited nutrient or water availability. 

Apical dominance. The modified expression of presently disclosed
transcription factors that control apical dominance could be used in ornamental 
horticulture, for example, to modify plant architecture. 

Branching patterns. Several presently disclosed transcription factor genes have
been used to manipulate branching, which could provide benefits in the forestry 
industry. For example, reduction in the formation of lateral branches could reduce 
knot formation. Conversely, increasing the number of lateral branches could provide 
utility when a plant is used as a windscreen, or may also provide ornamental 
advantages.

Leaf shape, color and modifications. It has been demonstrated in laboratory
experiments that overexpression of some of the presently disclosed transcription 
factors produced marked effects on leaf development. At early stages of growth, these 
transgenic seedlings developed narrow, upward pointing leaves with long petioles, 
possibly indicating a disruption in circadian-clock controlled processes or nyctinastic 
movements. Other transcription factor genes can be used to increase plant biomass; 
large size would be useful in crops where the vegetative portion of the plant is the 
marketable portion. 

Siliques. Genes that alter silique conformation in brassicates may be used to
modify fruit ripening processes in brassicates and other plants, which may positively 
affect seed or fruit quality. 

Stem morphology and shoot modifications. Laboratory studies have
demonstrated that introducing several of the presently disclosed transcription factor 
genes into plants can cause stem bifurcations in shoots, in which the shoot meristems 
split to form two or three separate shoots. This unique appearance would be desirable 
in ornamental applications. 

Diseases, pathogens and pests. A number of the presently disclosed
transcription factor genes have been shown to or are likely to confer resistance to 
various plant diseases, pathogens and pests. The offending organisms include fungal 
pathogens Fusarium oxysporum, Botrytis cinerea, Sclerotinia sclerotiorum, and 
Erysiphe orontii. Bacterial pathogens to which resistance may be conferred include
Pseudomonas syringae. Other problem organisms may potentially include 
nematodes, mollicutes, parasites, or herbivorous arthropods. In each case, one or 
more transformed transcription factor genes may provide some benefit to the plant to 
help prevent or overcome infestation. The mechanisms by which the transcription 
factors work could include increasing surface waxes or oils, surface thickness, local 
senescence, or the activation of signal transduction pathways that regulate plant 
defense in response to attacks by herbivorous pests (including, for example, protease 
inhibitors).

Increased tolerance of plants to nutrient-limited soils. Presently disclosed
transcription factor genes introduced into plants may provide the means to improve 
uptake of essential nutrients, including nitrogenous compounds, phosphates, 
potassium, and trace minerals. The effect of these modifications is to increase the 
seedling germination and range of ornamental and crop plants. The utilities of 
presently disclosed transcription factor genes conferring tolerance to conditions of 
low nutrients also include cost savings to the grower by reducing the amounts of
fertilizer needed, environmental benefits of reduced fertilizer runoff, and improved
yield and stress tolerance. In addition, these genes could be used to alter seed protein
amounts and/or composition that could impact yield as well as the nutritional value 
and production of various food products. 

Hormone sensitivity. One or more of the presently disclosed transcription
factor genes have been shown to affect plant abscisic acid (ABA) sensitivity. This 
plant hormone is likely the most important hormone in mediating the adaptation of a 
plant to stress. For example, ABA mediates conversion of apical meristems into 
dormant buds. In response to increasingly cold conditions, the newly developing 
leaves growing above the meristem become converted into stiff bud scales that closely 
wrap the meristem and protect it from mechanical damage during winter. ABA in the 
bud also enforces dormancy; during premature warm spells, the buds are inhibited 
from sprouting. Bud dormancy is eliminated after either a prolonged period of
cold or a significant number of lengthening days. Thus, by affecting ABA sensitivity,
introduced transcription factor genes may affect cold sensitivity and survivability.
ABA is also important in protecting plants from drought.

Several other of the present transcription factor genes have been used to 
manipulate ethylene signal transduction and response pathways. These genes can thus 
be used to manipulate the processes influenced by ethylene, such as seed germination 
or fruit ripening, and to improve seed or fruit quality. 

Production of seed and leaf prenyl lipids, including tocopherol. Prenyl lipids
play a role in anchoring proteins in membranes or membranous organelles. Thus 
modifying the prenyl lipid content of seeds and leaves could affect membrane 
integrity and function. A number of presently disclosed transcription factor genes
have been shown to modify the tocopherol composition of plants. Tocopherols have
both antioxidant and vitamin E activity. 



Production of seed and leaf phytosterols. Presently disclosed transcription
factor genes that modify levels of phytosterols in plants may have at least two 
utilities. First, phytosterols are an important source of precursors for the manufacture 
of human steroid hormones. Thus, regulation of transcription factor expression or 
activity could lead to elevated levels of important human steroid precursors for steroid 
semi-synthesis. For example, transcription factors that cause elevated levels of 
campesterol in leaves, or sitosterols and stigmasterols in seed crops, would be useful 
for this purpose. Phytosterols and their hydrogenated derivatives phytostanols also 
have proven cholesterol-lowering properties, and transcription factor genes that 
modify the expression of these compounds in plants would thus provide health 
benefits. 

Production of seed and leaf glucosinolates. Some glucosinolates have
anti-cancer activity; thus, increasing the levels or altering the composition of these compounds by
introducing several of the presently disclosed transcription factors might be of interest
from a nutraceutical standpoint. Glucosinolates form part of a plant's natural
defense against insects. Modification of glucosinolate composition or quantity could 
therefore afford increased protection from predators. Furthermore, in edible crops, 
tissue specific promoters might be used to ensure that these compounds accumulate 
specifically in tissues, such as the epidermis, which are not taken for consumption. 

Modified seed oil content. The composition of seeds, particularly with respect
to seed oil amounts and/or composition, is very important for the nutritional value and
production of various food and feed products. Several of the presently disclosed
transcription factor genes that alter seed oil content or seed lipid saturation could be
used to improve the heat stability of oils or to improve the nutritional quality of seed 
oil, by, for example, reducing the number of calories in seed, increasing the number of 
calories in animal feeds, or altering the ratio of saturated to unsaturated lipids 
comprising the oils.

Seed and leaf fatty acid composition. A number of the presently disclosed
transcription factor genes have been shown to alter the fatty acid composition in 
plants, and seeds in particular. This modification may find particular value for 
improving the nutritional value of, for example, seeds or whole plants. Dietary fatty
acid ratios have been shown to have an effect on, for example, bone integrity and
remodeling (see, for example, Weiler, H.A., Pediatr. Res. (2000) 47(5): 692-697). The
ratio of dietary fatty acids may alter the precursor pools of long-chain polyunsaturated 
fatty acids that serve as precursors for prostaglandin synthesis. In mammalian 
connective tissue, prostaglandins serve as important signals regulating the balance 
between resorption and formation in bone and cartilage. Thus dietary fatty acid ratios 
altered in seeds may affect the etiology and outcome of bone loss. 

Modified seed protein content. As with seed oils, the composition of seeds,
particularly with respect to protein amounts and/or composition, is very important for
the nutritional value and production of various food and feed products. A number of
the presently disclosed transcription factor genes that modify the protein concentrations in
seeds would provide nutritional benefits, and may be used to prolong storage, increase
seed pest or disease resistance, or modify germination rates.

Production of flavonoids in leaves and other plant parts. Expression of
presently disclosed transcription factor genes that increase flavonoid production in
plants, including anthocyanins and condensed tannins, may be used to alter pigment
production for horticultural purposes, and possibly to increase stress resistance.
Flavonoids have antimicrobial activity and could be used to engineer pathogen 
resistance. Several flavonoid compounds have health promoting effects such as the 
inhibition of tumor growth and cancer, prevention of bone loss and the prevention of 
the oxidation of lipids. Increasing levels of condensed tannins, whose biosynthetic 
pathway is shared with anthocyanin biosynthesis, in forage legumes is an important 
agronomic trait because they prevent pasture bloat by collapsing protein foams within 
the rumen. For a review on the utilities of flavonoids and their derivatives, refer to 
Dixon et al. (1999) Trends Plant Sci. 4:394-400. 

Production of diterpenes in leaves and other plant parts. Depending on the
plant species, varying amounts of diverse secondary biochemicals (often lipophilic
terpenes) are produced and exuded or volatilized by trichomes. These exotic
secondary biochemicals, which are relatively easy to extract because they are on the 
surface of the leaf, have been widely used in such products as flavors and aromas, 
drugs, pesticides and cosmetics. Thus, the overexpression of genes that are used to 
produce diterpenes in plants may be accomplished by introducing transcription factor 
genes that induce said overexpression. One class of secondary metabolites, the 
diterpenes, can affect several biological systems such as tumor progression,
prostaglandin synthesis and tissue inflammation. In addition, diterpenes can act as
insect pheromones, termite allomones, and can exhibit neurotoxic, cytotoxic and
antimitotic activities. As a result of this functional diversity, diterpenes have been the
target of research in several pharmaceutical ventures. In most cases where the metabolic
pathways are impossible to engineer, increasing trichome density or size on leaves 
may be the only way to increase plant productivity. 

Production of anthocyanin in leaves and other plant parts. Several presently
disclosed transcription factor genes can be used to alter anthocyanin production in 
numerous plant species. The potential utilities of these genes include alterations in 
pigment production for horticultural purposes, and possibly increasing stress 
resistance in combination with another transcription factor. 

Production of miscellaneous secondary metabolites. Microarray data suggest
that flux through the aromatic amino acid biosynthetic pathways and primary and
secondary metabolite biosynthetic pathways is up-regulated. Presently disclosed
transcription factors have been shown to be involved in regulating alkaloid
biosynthesis, in part by up-regulating the enzymes indole-3-glycerol phosphatase and
strictosidine synthase. Phenylalanine ammonia lyase, chalcone synthase and
trans-cinnamate mono-oxygenase are also induced, and are involved in phenylpropanoid
biosynthesis. 

Sugar, starch, hemicellulose composition. Overexpression of the presently
disclosed transcription factors that affect sugar content resulted in plants with altered 
leaf insoluble sugar content. Transcription factors that alter plant cell wall 
composition have several potential applications, including altering food digestibility,
plant tensile strength, wood quality, pathogen resistance and pulp production. The
potential utilities of a gene involved in glucose-specific sugar sensing are to alter
energy balance, photosynthetic rate, carbohydrate accumulation, biomass production, 
source-sink relationships, and senescence. 

Hemicellulose is not desirable in paper pulps because of its lack of strength 
compared with cellulose. Thus modulating the amounts of cellulose vs. hemicellulose 
in the plant cell wall is desirable for the paper/lumber industry. Increasing the 
insoluble carbohydrate content in various fruits, vegetables, and other edible 
consumer products will result in enhanced fiber content. Increased fiber content 
would not only provide health benefits in food products, but might also increase 
digestibility of forage crops. In addition, the hemicellulose and pectin content of fruits 
and berries affects the quality of jam and catsup made from them. Changes in 
hemicellulose and pectin content could result in a superior consumer product. 

Plant response to sugars and sugar composition. In addition to their important
role as an energy source and structural component of the plant cell, sugars are central 
regulatory molecules that control several aspects of plant physiology, metabolism and 
development. It is thought that this control is achieved by regulating gene expression 
and, in higher plants, sugars have been shown to repress or activate plant genes 
involved in many essential processes such as photosynthesis, glyoxylate metabolism, 
respiration, starch and sucrose synthesis and degradation, pathogen response, 
wounding response, cell cycle regulation, pigmentation, flowering and senescence. 
The mechanisms by which sugars control gene expression are not understood. 

Because sugars are important signaling molecules, the ability to control either 
the concentration of a signaling sugar or how the plant perceives or responds to a 
signaling sugar could be used to control plant development, physiology or 
metabolism. For example, the flux of sucrose (a disaccharide sugar used for 
systemically transporting carbon and energy in most plants) has been shown to affect 
gene expression and alter storage compound accumulation in seeds. Manipulation of 
the sucrose signaling pathway in seeds may therefore cause seeds to have more 
protein, oil or carbohydrate, depending on the type of manipulation. Similarly, in 
tubers, sucrose is converted to starch which is used as an energy store. It is thought 
that sugar signaling pathways may partially determine the levels of starch synthesized
in the tubers. The manipulation of sugar signaling in tubers could lead to tubers with a
higher starch content. 

Thus, the presently disclosed transcription factor genes that manipulate the 
sugar signal transduction pathway may lead to altered gene expression to produce 
plants with desirable traits. In particular, manipulation of sugar signal transduction 
pathways could be used to alter source-sink relationships in seeds, tubers, roots and 
other storage organs leading to increase in yield. 

Plant growth rate and development . A number of the presently disclosed 
transcription factor genes have been shown to have significant effects on plant growth 
rate and development. These observations have included, for example, more rapid or 
delayed growth and development of reproductive organs. This would provide utility 
for regions with short or long growing seasons, respectively. Accelerating plant 
growth would also improve early yield or increase biomass at an earlier stage, when 
such is desirable (for example, in producing forestry products). 

Embryo development . Presently disclosed transcription factor genes that alter 
embryo development have been used to alter seed protein and oil amounts and/or 
composition which is very important for the nutritional value and production of 
various food products. Seed shape and seed coat may also be altered by these genes, 
which may provide for improved storage stability. 

Seed germination rate . A number of the presently disclosed transcription 
factor genes have been shown to modify seed germination rate, including when the 
seeds are in conditions normally unfavorable for germination (e.g., cold, heat or salt 
stress, or in the presence of ABA), and may thus be used to modify and improve 
germination rates under adverse conditions. 

Plant, seedling vigor . Seedlings transformed with presently disclosed 
transcription factors have been shown to possess larger cotyledons and appeared 
somewhat more advanced than control plants. This indicates that the seedlings 
developed more rapidly than the control plants. Rapid seedling development is likely 
to reduce loss due to diseases particularly prevalent at the seedling stage (e.g., 
damping off) and is thus important for survivability of plants germinating in the field 
or in controlled environments. 



Senescence, cell death . Presently disclosed transcription factor genes may be 
used to alter senescence responses in plants. Although leaf senescence is thought to be 
an evolutionary adaptation to recycle nutrients, the ability to control senescence in an 
agricultural setting has significant value. For example, a delay in leaf senescence in 
some maize hybrids is associated with a significant increase in yields and a delay of a 
few days in the senescence of soybean plants can have a large impact on yield. 
Delayed flower senescence may also generate plants that retain their blossoms longer 
and this may be of potential interest to the ornamental horticulture industry. 

Modified fertility . Plants that overexpress a number of the presently disclosed 
transcription factor genes have been shown to possess reduced fertility. This could 
be a desirable trait, as it could be exploited to prevent or minimize the escape of the 
pollen of genetically modified organisms (GMOs) into the environment. 

Early and delayed flowering . Presently disclosed transcription factor genes 
that accelerate flowering could have valuable applications in breeding programs since 
they allow much faster generation times. In a number of species, for example, 
broccoli and cauliflower, where the reproductive parts of the plants constitute the crop 
and the vegetative tissues are discarded, it would be advantageous to accelerate time 
to flowering. Accelerating flowering could shorten crop and tree breeding programs. 
Additionally, in some instances, a faster generation time might allow additional 
harvests of a crop to be made within a given growing season. A number of 
Arabidopsis genes have already been shown to accelerate flowering when 
constitutively expressed. These include LEAFY, APETALA1 and CONSTANS 
(Mandel, M. et al., 1995, Nature 377, 522-524; Weigel, D. and Nilsson, O., 1995, 
Nature 377, 495-500; Simon et al., 1996, Nature 384, 59-62). 

By regulating the expression of potential flowering genes using inducible promoters, 
flowering could be triggered by application of an inducer chemical. This would allow 
flowering to be synchronized across a crop and facilitate more efficient harvesting. 
Such inducible systems could also be used to tune the flowering of crop varieties to 
different latitudes. At present, species such as soybean and cotton are available as a 
series of maturity groups that are suitable for different latitudes on the basis of their 
flowering time (which is governed by day-length). A system in which flowering could 
be chemically controlled would allow a single high-yielding northern maturity group 
to be grown at any latitude. In southern regions such plants could be grown for longer, 
thereby increasing yields, before flowering was induced. In more northern areas, the 
induction would be used to ensure that the crop flowers prior to the first winter frosts. 

In a sizeable number of species, for example, root crops, where the vegetative 
parts of the plants constitute the crop and the reproductive tissues are discarded, it 
would be advantageous to delay or prevent flowering. Extending vegetative 
development with presently disclosed transcription factor genes could thus bring 
about large increases in yields. Prevention of flowering might help maximize 
vegetative yields and prevent escape of genetically modified organism (GMO) pollen. 

Extended flowering phase . Presently disclosed transcription factors that extend 
flowering time have utility in engineering plants with longer-lasting flowers for the 
horticulture industry, and for extending the time in which the plant is fertile. 

Flower and leaf development . Presently disclosed transcription factor genes 
have been used to modify the development of flowers and leaves. This could be 
advantageous in the development of new ornamental cultivars that present unique 
configurations. In addition, some of these genes have been shown to reduce a plant's 
fertility, which is also useful for helping to prevent development of pollen of GMOs. 

Flower abscission . Presently disclosed transcription factor genes introduced 
into plants have been used to retain flowers for longer periods. This would provide a 
significant benefit to the ornamental industry, for both cut flowers and woody plant 
varieties (of, for example, maize), as well as have the potential to lengthen the fertile 
period of a plant, which could positively impact yield and breeding programs. 

A listing of specific effects and utilities that the presently disclosed 
transcription factor genes have on plants, as determined by direct observation and 
assay analysis, is provided in Table 4. 

XVI. Antisense and Co-suppression 

In addition to expression of the nucleic acids of the invention as gene 
replacement or plant phenotype modification nucleic acids, the nucleic acids are also 
useful for sense and anti-sense suppression of expression, e.g., to down-regulate 
expression of a nucleic acid of the invention, e.g., as a further mechanism for 
modulating plant phenotype. That is, the nucleic acids of the invention, or 
subsequences or anti-sense sequences thereof, can be used to block expression of 
naturally occurring homologous nucleic acids. A variety of sense and anti-sense 
technologies are known in the art, e.g., as set forth in Lichtenstein and Nellen (1997) 
Antisense Technology: A Practical Approach, IRL Press at Oxford University Press, 
Oxford, U.K. In general, sense or anti-sense sequences are introduced into a cell, 
where they are optionally amplified, e.g., by transcription. Such sequences include 
both simple oligonucleotide sequences and catalytic sequences such as ribozymes. 

For example, a reduction or elimination of expression (i.e., a "knock-out") of a 
transcription factor or transcription factor homologue polypeptide in a transgenic 
plant, e.g., to modify a plant trait, can be obtained by introducing an antisense construct 
corresponding to the polypeptide of interest as a cDNA. For antisense suppression, the 
transcription factor or homologue cDNA is arranged in reverse orientation (with 
respect to the coding sequence) relative to the promoter sequence in the expression 
vector. The introduced sequence need not be the full length cDNA or gene, and need 
not be identical to the cDNA or gene found in the plant type to be transformed. 
Typically, the antisense sequence need only be capable of hybridizing to the target 
gene or RNA of interest. Thus, where the introduced sequence is of shorter length, a 
higher degree of homology to the endogenous transcription factor sequence will be 
needed for effective antisense suppression. While antisense sequences of various 
lengths can be utilized, preferably, the introduced antisense sequence in the vector 
will be at least 30 nucleotides in length, and improved antisense suppression will 
typically be observed as the length of the antisense sequence increases. Preferably, 
the length of the antisense sequence in the vector will be greater than 100 nucleotides. 
Transcription of an antisense construct as described results in the production of RNA 
molecules that are the reverse complement of mRNA molecules transcribed from the 
endogenous transcription factor gene in the plant cell. 
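
By way of illustration only, the reverse-complement relationship described above can be sketched in Python. The function names, the 30-nucleotide default (taken from the minimum length stated above) and the example fragment are assumptions made for this sketch and are not part of the disclosed constructs.

```python
# Illustrative sketch; names and the example fragment are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_fragment, min_length=30):
    """RNA transcribed when a sense-strand cDNA fragment is cloned in
    reverse orientation relative to the promoter."""
    if len(sense_fragment) < min_length:
        raise ValueError("fragment shorter than the stated minimum length")
    return reverse_complement(sense_fragment).replace("T", "U")


fragment = "ATGGCC" * 5            # 30-nt stand-in for a cDNA segment
print(antisense_rna(fragment))     # pairs base-for-base with the sense mRNA
```

The length check only mirrors the stated preference for antisense sequences of at least 30 nucleotides; it does not predict suppression efficiency.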

Suppression of endogenous transcription factor gene expression can also be 
achieved using a ribozyme. Ribozymes are RNA molecules that possess highly 
specific endoribonuclease activity. The production and use of ribozymes are 
disclosed in U.S. Patent No. 4,987,071 and U.S. Patent No. 5,543,508. Synthetic 
ribozyme sequences including antisense RNAs can be used to confer RNA cleaving 
activity on the antisense RNA, such that endogenous mRNA molecules that hybridize 
to the antisense RNA are cleaved, which in turn leads to an enhanced antisense 
inhibition of endogenous gene expression. 

Suppression of endogenous transcription factor gene expression can also be 
achieved using RNA interference , or RNAi. RNAi is a post-transcriptional, targeted 
gene-silencing technique that uses double-stranded RNA (dsRNA) to incite 
degradation of messenger RNA (mRNA) containing the same sequence as the dsRNA 
(Constans (2002) The Scientist 16:36). Small interfering RNAs, or siRNAs, are 
produced in at least two steps: an endogenous ribonuclease cleaves longer dsRNA 
into shorter, 21-23 nucleotide-long RNAs. The siRNA segments then mediate the 
degradation of the target mRNA (Zamore (2001) Nature Struct Biol 8:746-50). 
RNAi has been used for gene function determination in a manner similar to antisense 
oligonucleotides (Constans (2002) The Scientist 16:36). Expression vectors that 
continually express siRNAs in transiently and stably transfected cells have been engineered 
to express small hairpin RNAs (shRNAs), which are processed in vivo into siRNA-like 
molecules capable of carrying out gene-specific silencing (Brummelkamp et al. 
(2002) Science 296:550-553, and Paddison et al. (2002) Genes & Dev. 16:948-958). 
Post-transcriptional gene silencing by double-stranded RNA is discussed in further 
detail by Hammond et al. (2001) Nature Rev Gen 2:110-119, Fire et al. (1998) Nature 
391:806-811 and Timmons and Fire (1998) Nature 395:854. 
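
The two-step process described above (cleavage of a longer dsRNA into short fragments, followed by sequence-matched targeting of an mRNA) can be pictured with a deliberately simplified Python model. The fixed 21-nucleotide fragment size, the exact-substring matching rule and the example sequences are assumptions made for this sketch; actual siRNA production and target recognition are more complex.

```python
# Toy model; fragment size and sequences are hypothetical illustrations.
def dice(trigger, size=21):
    """Cut a trigger sequence into consecutive fragments of `size`
    nucleotides, discarding any shorter remainder."""
    return [trigger[i:i + size]
            for i in range(0, len(trigger) - size + 1, size)]


def is_targeted(mrna, sirnas):
    """True if the mRNA contains the sequence of at least one siRNA."""
    return any(sirna in mrna for sirna in sirnas)


trigger = "ACGU" * 11                      # 44-nt stand-in for the dsRNA sense strand
sirnas = dice(trigger)
print(len(sirnas), is_targeted("GGG" + trigger + "AAA", sirnas))
```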

Vectors in which RNA encoded by a transcription factor or transcription factor 
homologue cDNA is over-expressed can also be used to obtain co-suppression of a 
corresponding endogenous gene, e.g., in the manner described in U.S. Patent No. 
5,231,020 to Jorgensen. Such co-suppression (also termed sense suppression) does 
not require that the entire transcription factor cDNA be introduced into the plant cells, 
nor does it require that the introduced sequence be exactly identical to the endogenous 
transcription factor gene of interest. However, as with antisense suppression, the 
suppressive efficiency will be enhanced as specificity of hybridization is increased, 
e.g., as the introduced sequence is lengthened, and/or as the sequence similarity 
between the introduced sequence and the endogenous transcription factor gene is 
increased. 

Vectors expressing an untranslatable form of the transcription factor mRNA 
(e.g., sequences comprising one or more stop codons or nonsense mutations) can also be 
used to suppress expression of an endogenous transcription factor, thereby reducing or 
eliminating its activity and modifying one or more traits. Methods for producing 
such constructs are described in U.S. Patent No. 5,583,021. Preferably, such 
constructs are made by introducing a premature stop codon into the transcription 
factor gene. Alternatively, a plant trait can be modified by gene silencing using 
double-strand RNA (Sharp (1999) Genes and Development 13:139-141). Another 
method for abolishing the expression of a gene is by insertion mutagenesis using the 
T-DNA of Agrobacterium tumefaciens. After generating the insertion mutants, the 
mutants can be screened to identify those containing the insertion in a transcription 
factor or transcription factor homologue gene. Plants containing a single transgene 
insertion event at the desired gene can be crossed to generate homozygous plants for 
the mutation. Such methods are well known to those of skill in the art. (See for 
example Koncz et al. (1992) Methods in Arabidopsis Research, World Scientific.) 

Alternatively, a plant phenotype can be altered by eliminating an endogenous 
gene, such as a transcription factor or transcription factor homologue, e.g., by 
homologous recombination (Kempin et al. (1997) Nature 389:802-803). 

A plant trait can also be modified by using the Cre-lox system (for example, as 
described in US Pat. No. 5,658,772). A plant genome can be modified to include 
first and second lox sites that are then contacted with a Cre recombinase. If the lox 
sites are in the same orientation, the intervening DNA sequence between the two sites 
is excised. If the lox sites are in the opposite orientation, the intervening sequence is 
inverted. 
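
The orientation rule stated above can be illustrated with a small Python model. The notation is an assumption made only for this sketch: a lox site is written as ">" or "<" according to its orientation, the genome is an upper-case DNA string, and the function name is hypothetical.

```python
import re

# Toy model: ">" and "<" stand for lox sites in the two orientations.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def cre_recombine(genome):
    """Apply Cre to a genome string carrying exactly two lox sites."""
    sites = [m.start() for m in re.finditer(r"[<>]", genome)]
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    i, j = sites
    left, between, right = genome[:i], genome[i + 1:j], genome[j + 1:]
    if genome[i] == genome[j]:
        # Same orientation: the intervening DNA is excised, one site remains.
        return left + genome[i] + right
    # Opposite orientation: the intervening DNA is inverted in place.
    inverted = between.translate(_COMPLEMENT)[::-1]
    return left + genome[i] + inverted + genome[j] + right


print(cre_recombine("AAA>CCGG>TTT"))
print(cre_recombine("AAA>CCGA<TTT"))
```

Inversion is modeled as the reverse complement of the intervening segment, since flipping a double-stranded segment presents the opposite strand in the original reading direction.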

The polynucleotides and polypeptides of this invention can also be expressed 
in a plant in the absence of an expression cassette by manipulating the activity or 
expression level of the endogenous gene by other means. For example, a gene can be 
ectopically expressed by T-DNA activation tagging (Ichikawa et al. (1997) 
Nature 390:698-701; Kakimoto et al. (1996) Science 274:982-985). This method 
entails transforming a plant with a gene tag containing multiple transcriptional 
enhancers; once the tag has inserted into the genome, expression of a flanking 
gene coding sequence becomes deregulated. In another example, the transcriptional 
machinery in a plant can be modified so as to increase transcription levels of a 
polynucleotide of the invention (See, e.g., PCT Publications WO 96/06166 and WO 
98/53057 which describe the modification of the DNA-binding specificity of zinc 
finger proteins by changing particular amino acids in the DNA-binding motif). 

The transgenic plant can also include the machinery necessary for expressing 
or altering the activity of a polypeptide encoded by an endogenous gene, for example 
by altering the phosphorylation state of the polypeptide to maintain it in an activated 
state. 

Transgenic plants (or plant cells, or plant explants, or plant tissues) 
incorporating the polynucleotides of the invention and/or expressing the polypeptides 
of the invention can be produced by a variety of well established techniques as 
described above. Following construction of a vector, most typically an expression 
cassette, including a polynucleotide, e.g., encoding a transcription factor or 
transcription factor homologue, of the invention, standard techniques can be used to 
introduce the polynucleotide into a plant, a plant cell, a plant explant or a plant tissue 
of interest. Optionally, the plant cell, explant or tissue can be regenerated to produce 
a transgenic plant. 

The plant can be any higher plant, including gymnosperms, 
monocotyledonous and dicotyledonous plants. Suitable protocols are available for 
Leguminosae (alfalfa, soybean, clover, etc.), Umbelliferae (carrot, celery, parsnip), 
Cruciferae (cabbage, radish, rapeseed, broccoli, etc.), Cucurbitaceae (melons and 
cucumber), Gramineae (wheat, corn, rice, barley, millet, etc.), Solanaceae (potato, 
tomato, tobacco, peppers, etc.), and various other crops. See protocols described in 
Ammirato et al. (1984) Handbook of Plant Cell Culture - Crop Species, Macmillan 
Publ. Co.; Shimamoto et al. (1989) Nature 338:274-276; Fromm et al. (1990) 
Bio/Technology 8:833-839; and Vasil et al. (1990) Bio/Technology 8:429-434. 

Transformation and regeneration of both monocotyledonous and 
dicotyledonous plant cells is now routine, and the selection of the most appropriate 
transformation technique will be determined by the practitioner. The choice of 
method will vary with the type of plant to be transformed; those skilled in the art will 
recognize the suitability of particular methods for given plant types. Suitable methods 
can include, but are not limited to: electroporation of plant protoplasts; liposome- 
mediated transformation; polyethylene glycol (PEG) mediated transformation; 
transformation using viruses; micro-injection of plant cells; micro-projectile 
bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens 
mediated transformation. Transformation means introducing a nucleotide sequence 
into a plant in a manner to cause stable or transient expression of the sequence. 

Successful examples of the modification of plant characteristics by 
transformation with cloned sequences which serve to illustrate the current knowledge 
in this field of technology, and which are herein incorporated by reference, include: 
U.S. Patent Nos. 5,571,706; 5,677,175; 5,510,471; 5,750,386; 5,597,945; 5,589,615; 
5,750,871; 5,268,526; 5,780,708; 5,538,880; 5,773,269; 5,736,369 and 5,610,042. 

Following transformation, plants are preferably selected using a dominant 
selectable marker incorporated into the transformation vector. Typically, such a 
marker will confer antibiotic or herbicide resistance on the transformed plants, and 
selection of transformants can be accomplished by exposing the plants to appropriate 
concentrations of the antibiotic or herbicide. 

After transformed plants are selected and grown to maturity, those plants 
showing a modified trait are identified. The modified trait can be any of those traits 
described above. Additionally, to confirm that the modified trait is due to changes in 
expression levels or activity of the polypeptide or polynucleotide of the invention, these changes can 
be determined by analyzing mRNA expression using Northern blots, RT-PCR or 
microarrays, or protein expression using immunoblots or Western blots or gel shift 
assays. 

XVII. Integrated Systems - Sequence Identity 

Additionally, the present invention may be an integrated system, computer or 
computer readable medium that comprises an instruction set for determining the 
identity of one or more sequences in a database. In addition, the instruction set can be 
used to generate or identify sequences that meet any specified criteria. Furthermore, 
the instruction set may be used to associate or link certain functional benefits, such 
as improved characteristics, with one or more identified sequences. 

For example, the instruction set can include, e.g., a sequence comparison or 
other alignment program, e.g., an available program such as, for example, the 
Wisconsin Package Version 10.0, such as BLAST, FASTA, PILEUP, 
FINDPATTERNS or the like (GCG, Madison, WI). Public sequence databases such 
as GenBank, EMBL, Swiss-Prot and PIR or private sequence databases such as 
PHYTOSEQ sequence database (Incyte Genomics, Palo Alto, CA) can be searched. 

Alignment of sequences for comparison can be conducted by the local 
homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the 
homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 
48:443-453, by the search for similarity method of Pearson and Lipman (1988) Proc. 
Natl. Acad. Sci. U.S.A. 85:2444-2448, or by computerized implementations of these 
algorithms. After alignment, sequence comparisons between two (or more) 
polynucleotides or polypeptides are typically performed by comparing 
the sequences over a comparison window to identify and compare local regions 
of sequence similarity. The comparison window can be a segment of at least about 20 
contiguous positions, usually about 50 to about 200, more usually about 100 to about 
150 contiguous positions. A description of the method is provided in Ausubel et al., 
supra. 

A variety of methods for determining sequence relationships can be used, 
including manual alignment and computer assisted sequence alignment and analysis. 
This latter approach is a preferred approach in the present invention, due to the 
increased throughput afforded by computer assisted methods. As noted above, a 
variety of computer programs for performing sequence alignment are available, or can 
be produced by one of skill. 
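
As one hedged example of such a program, the local homology approach of Smith and Waterman can be sketched in a few lines of Python. The scoring values (match +1, mismatch -1, linear gap penalty -2) and the function name are assumptions chosen for this sketch; they are not parameters specified in this disclosure, and the code returns only the best local score rather than the alignment itself.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)          # scores for the previous row
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Keeping only two rows of the scoring matrix is sufficient when the score alone is needed; recovering the aligned residues would require storing the full matrix and tracing back from the best cell.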

One example algorithm that is suitable for determining percent sequence 
identity and sequence similarity is the BLAST algorithm, which is described in 
Altschul et al. (1990) J. Mol. Biol. 215:403-410. Software for performing BLAST 
analyses is publicly available, e.g., through the National Center for Biotechnology 
Information (see internet website at ncbi.nlm.nih.gov). This algorithm involves first 
identifying high scoring sequence pairs (HSPs) by identifying short words of length 
W in the query sequence, which either match or satisfy some positive-valued 
threshold score T when aligned with a word of the same length in a database 
sequence. T is referred to as the neighborhood word score threshold (Altschul et al., 
supra). These initial neighborhood word hits act as seeds for initiating searches to 
find longer HSPs containing them. The word hits are then extended in both directions 
along each sequence for as far as the cumulative alignment score can be increased. 
Cumulative scores are calculated using, for nucleotide sequences, the parameters M 
(reward score for a pair of matching residues; always > 0) and N (penalty score for 
mismatching residues; always < 0). For amino acid sequences, a scoring matrix is 
used to calculate the cumulative score. Extension of the word hits in each direction 
is halted when: the cumulative alignment score falls off by the quantity X from its 
maximum achieved value; the cumulative score goes to zero or below, due to the 
accumulation of one or more negative-scoring residue alignments; or the end of either 
sequence is reached. The BLAST algorithm parameters W, T, and X determine the 
sensitivity and speed of the alignment. The BLASTN program (for nucleotide 
sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff 
of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the 
BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, 
and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. 
Acad. Sci. USA 89:10915). Unless otherwise indicated, "sequence identity" here 
refers to the % sequence identity generated from a tblastx using the NCBI version of 
the algorithm at the default settings using gapped alignments with the filter "off" (see, 
for example, internet website at ncbi.nlm.nih.gov). 

In addition to calculating percent sequence identity, the BLAST algorithm also 
performs a statistical analysis of the similarity between two sequences (see, e.g., 
Karlin & Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877). One measure 
of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), 
which provides an indication of the probability by which a match between two 
nucleotide or amino acid sequences would occur by chance. For example, a nucleic 
acid is considered similar to a reference sequence (and, therefore, in this context, 
homologous) if the smallest sum probability in a comparison of the test nucleic acid to 
the reference nucleic acid is less than about 0.1, or less than about 0.01, or even 
less than about 0.001. An additional example of a useful sequence alignment 
algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of 
related sequences using progressive, pairwise alignments. The program can align, e.g., 
up to 300 sequences of a maximum length of 5,000 letters. 

The integrated system, or computer typically includes a user input interface 
allowing a user to selectively view one or more sequence records corresponding to the 
one or more character strings, as well as an instruction set which aligns the one or 
more character strings with each other or with an additional character string to 
identify one or more region of sequence similarity. The system may include a link of 
one or more character strings with a particular phenotype or gene function. Typically, 
the system includes a user readable output element that displays an alignment 
produced by the alignment instruction set. 

The methods of this invention can be implemented in a localized or distributed 
computing environment. In a distributed environment, the methods may be implemented 
on a single computer comprising multiple processors or on a multiplicity of 
computers. The computers can be linked, e.g. through a common bus, but more 
preferably the computer(s) are nodes on a network. The network can be a generalized 
or a dedicated local or wide-area network and, in certain preferred embodiments, the 
computers may be components of an intra-net or an internet. 

Thus, the invention provides methods for identifying a sequence similar or 
homologous to one or more polynucleotides as noted herein, or one or more target 
polypeptides encoded by the polynucleotides, or otherwise noted herein and may 
include linking or associating a given plant phenotype or gene function with a 
sequence. In the methods, a sequence database is provided (locally or across an internet 
or intranet) and a query is made against the sequence database using the relevant 
sequences herein and associated plant phenotypes or gene functions. 

Any sequence herein can be entered into the database, before or after querying 
the database. This provides for both expansion of the database, and, if done before the 
querying step, for insertion of control sequences into the database. The control 
sequences can be detected by the query to ensure the general integrity of both the 
database and the query. As noted, the query can be performed using a web browser 
based interface. For example, the database can be a centralized public database such 
as those noted herein, and the querying can be done from a remote terminal or 
computer across an internet or intranet. 

XVIII. Examples 

The following examples are intended to illustrate but not limit the present 
invention. The complete descriptions of the traits associated with each polynucleotide 
of the invention are fully disclosed in Table 4 and Table 6. 

Example I: Full Length Gene Identification and Cloning 

Putative transcription factor sequences (genomic or ESTs) related to known 
transcription factors were identified in the Arabidopsis thaliana GenBank database 
using the tblastn sequence analysis program using default parameters and a P-value 
cutoff threshold of -4 or -5 or lower, depending on the length of the query sequence. 
Putative transcription factor sequence hits were then screened to identify those 
containing particular sequence strings. If the sequence hits contained such sequence 
strings, the sequences were confirmed as transcription factors. 
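
The two-stage screen described above (a significance cutoff followed by a check for particular sequence strings) might be sketched as follows. Several points are assumptions made only for illustration: the cutoff of -4 is read as a base-10 logarithm of the P-value, each hit is represented as a (name, P-value, protein sequence) tuple, and the motif pattern shown is a hypothetical placeholder, since the actual sequence strings used are not given in this passage.

```python
import math
import re


def screen_hits(hits, motif, log10_p_cutoff=-4.0):
    """Keep hits whose log10(P) is at or below the cutoff and whose
    sequence contains the motif (a regular expression)."""
    pattern = re.compile(motif)
    kept = []
    for name, p_value, protein in hits:
        # A reported P-value of 0 is treated as passing the cutoff.
        if p_value > 0 and math.log10(p_value) > log10_p_cutoff:
            continue
        if pattern.search(protein):
            kept.append(name)
    return kept


example_hits = [
    ("hitA", 1e-6, "MKRWSEE"),   # significant and contains the motif
    ("hitB", 1e-2, "MKRWSEE"),   # contains the motif but not significant
    ("hitC", 1e-8, "MAAAAAA"),   # significant but lacks the motif
]
print(screen_hits(example_hits, r"RW.E"))
```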

Alternatively, Arabidopsis thaliana cDNA libraries derived from different 
tissues or treatments, or genomic libraries were screened to identify novel members of 
a transcription family using a low stringency hybridization approach. Probes were 
synthesized using gene specific primers in a standard PCR reaction (annealing 
temperature 60° C) and labeled with 32P-dCTP using the High Prime DNA Labeling 
Kit (Boehringer Mannheim). Purified radiolabeled probes were added to filters 
immersed in Church hybridization medium (0.5 M NaPO4 pH 7.0, 7% SDS, 1% w/v 
bovine serum albumin) and hybridized overnight at 60° C with shaking. Filters were 
washed two times for 45 to 60 minutes with 1x SSC, 1% SDS at 60° C. 

To identify additional sequence 5' or 3' of a partial cDNA sequence in a cDNA 
library, 5' and 3' rapid amplification of cDNA ends (RACE) was performed using the 
Marathon cDNA amplification kit (Clontech, Palo Alto, CA). Generally, the 
method entailed first isolating poly(A) mRNA, performing first and second strand 
cDNA synthesis to generate double stranded cDNA, blunting cDNA ends, followed 
by ligation of the Marathon Adaptor to the cDNA to form a library of 
adaptor-ligated ds cDNA. 

Gene-specific primers were designed to be used along with adaptor specific 
primers for both 5' and 3' RACE reactions. Nested primers, rather than single 
primers, were used to increase PCR specificity. Using 5' and 3' RACE reactions, 5' 
and 3' RACE fragments were obtained, sequenced and cloned. The process was 
repeated until the 5' and 3' ends of the full-length gene were identified. Then the 
full-length cDNA was generated by end-to-end PCR using primers specific to the 
5' and 3' ends of the gene. 

Example II: Construction of Expression Vectors 

The sequence was amplified from a genomic or cDNA library using primers 
specific to sequences upstream and downstream of the coding region. The expression 
vector was pMEN20 or pMEN65, which are both derived from pMON316 (Sanders et 
al. (1987) Nucleic Acids Research 15:1543-1558) and contain the CaMV 35S 
promoter to express transgenes. To clone the sequence into the vector, both pMEN20 
and the amplified DNA fragment were digested separately with SalI and NotI 
restriction enzymes at 37° C for 2 hours. The digestion products were subject to 
electrophoresis in a 0.8% agarose gel and visualized by ethidium bromide staining. 
The DNA fragments containing the sequence and the linearized plasmid were excised 
and purified by using a Qiaquick gel extraction kit (Qiagen, Valencia CA). The 
fragments of interest were ligated at a ratio of 3:1 (vector to insert). Ligation 
reactions using T4 DNA ligase (New England Biolabs, Beverly MA) were carried out 
at 16° C for 16 hours. The ligated DNAs were transformed into competent cells of the 
E. coli strain DH5alpha by using the heat shock method. The transformations were 
plated on LB plates containing 50 mg/l kanamycin (Sigma, St. Louis, MO). 
Individual colonies were grown overnight in five milliliters of LB broth containing 50 
mg/l kanamycin at 37° C. Plasmid DNA was purified by using Qiaquick Mini Prep 
kits (Qiagen). 
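
The 3:1 vector-to-insert ratio used for ligation can be turned into DNA masses with a short calculation. The sketch below is illustrative only: it assumes the ratio is a molar ratio (the text does not say), and the fragment lengths and vector mass are hypothetical values rather than values from this disclosure.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, vector_to_insert=3.0):
    """Mass of insert (ng) that gives the requested vector:insert molar ratio.

    For double-stranded DNA, moles are proportional to mass / length, so
    insert_mass = vector_mass * (insert_bp / vector_bp) / ratio.
    """
    if min(vector_ng, vector_bp, insert_bp, vector_to_insert) <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * (insert_bp / vector_bp) / vector_to_insert


# Hypothetical example: 90 ng of a 12,000 bp vector and a 1,200 bp insert.
example_ng = insert_mass_ng(90.0, 12_000, 1_200)
print(f"insert needed: {example_ng:.2f} ng")
```

The function name and its arguments are assumptions made for this illustration; any real ligation would use the measured fragment sizes and concentrations.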

Example III: Transformation of Agrobacterium with the Expression Vector 

After the plasmid vector containing the gene was constructed, the vector was 
used to transform Agrobacterium tumefaciens cells expressing the gene products. The 
stock of Agrobacterium tumefaciens cells for transformation was made as described 
by Nagel et al. (1990) FEMS Microbiol. Letts. 67:325-328. Agrobacterium strain 
ABI was grown in 250 ml LB medium (Sigma) overnight at 28° C with shaking until 
an absorbance (A600) of 0.5 - 1.0 was reached. Cells were harvested by centrifugation 
at 4,000 x g for 15 min at 4° C. Cells were then resuspended in 250 µl chilled buffer 
(1 mM HEPES, pH adjusted to 7.0 with KOH). Cells were centrifuged again as 
described above and resuspended in 125 µl chilled buffer. Cells were then 
centrifuged and resuspended two more times in the same HEPES buffer as described 
above at a volume of 100 µl and 750 µl, respectively. Resuspended cells were then 
distributed into 40 µl aliquots, quickly frozen in liquid nitrogen, and stored at -80° C. 

Agrobacterium cells were transformed with plasmids prepared as described 
above following the protocol described by Nagel et al. For each DNA construct to be 
transformed, 50-100 ng DNA (generally resuspended in 10 mM Tris-HCl, 1 mM 
EDTA, pH 8.0) was mixed with 40 µl of Agrobacterium cells. The DNA/cell mixture 
was then transferred to a chilled cuvette with a 2 mm electrode gap and subjected to a 2.5 
kV charge dissipated at 25 µF and 200 µF using a Gene Pulser II apparatus (Bio-Rad, 
Hercules, CA). After electroporation, cells were immediately resuspended in 1.0 ml 
LB and allowed to recover without antibiotic selection for 2 - 4 hours at 28° C in a 
shaking incubator. After recovery, cells were plated onto selective medium of LB 
broth containing 100 µg/ml spectinomycin (Sigma) and incubated for 24-48 hours at 
28° C. Single colonies were then picked and inoculated in fresh medium. The 
presence of the plasmid construct was verified by PCR amplification and sequence 
analysis. 

Example IV: Transformation of Arabidopsis Plants with Agrobacterium 
tumefaciens with Expression Vector 

After transformation of Agrobacterium tumefaciens with plasmid vectors 
containing the gene, single Agrobacterium colonies were identified, propagated, and 
used to transform Arabidopsis plants. Briefly, 500 ml cultures of LB medium 
containing 50 mg/l kanamycin were inoculated with the colonies and grown at 28° C 
with shaking for 2 days until an optical absorbance at 600 nm wavelength over 1 cm 
(A600) of > 2.0 was reached. Cells were then harvested by centrifugation at 4,000 x g for 
10 min, and resuspended in infiltration medium (1/2 X Murashige and Skoog salts 
(Sigma), 1 X Gamborg's B-5 vitamins (Sigma), 5.0% (w/v) sucrose (Sigma), 0.044 
µM benzylamino purine (Sigma), 200 µl/l Silwet L-77 (Lehle Seeds)) until an A600 of 
0.8 was reached. 
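
Bringing a resuspended culture to the A600 of 0.8 described above is a simple dilution problem. The following sketch assumes absorbance scales linearly with cell density, which is only approximately true and is an assumption of this example; the measured values are hypothetical.

```python
def medium_to_add_ml(measured_a600, volume_ml, target_a600=0.8):
    """Volume of infiltration medium (ml) to add to reach the target A600.

    Uses C1 * V1 = C2 * V2 with absorbance standing in for cell density.
    """
    if measured_a600 <= 0 or volume_ml <= 0 or target_a600 <= 0:
        raise ValueError("all arguments must be positive")
    if measured_a600 < target_a600:
        raise ValueError("suspension is already below the target A600")
    final_volume_ml = volume_ml * measured_a600 / target_a600
    return final_volume_ml - volume_ml


# Hypothetical example: 100 ml of suspension reading A600 = 2.4.
print(f"add {medium_to_add_ml(2.4, 100.0):.0f} ml of infiltration medium")
```

In practice the suspension would be re-measured after dilution, since dense cultures often read lower than a linear relationship predicts.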

Prior to transformation, Arabidopsis thaliana seeds (ecotype Columbia) were 
sown at a density of ~10 plants per 4" pot onto Pro-Mix BX potting medium 
(Hummert International) covered with fiberglass mesh (18 mm X 16 mm). Plants 
were grown under continuous illumination (50-75 µE/m²/sec) at 22-23° C with 
65-70% relative humidity. After about 4 weeks, primary inflorescence stems (bolts) were 
cut off to encourage growth of multiple secondary bolts. After flowering of the 
mature secondary bolts, plants were prepared for transformation by removal of all 
siliques and opened flowers. 

The pots were then immersed upside down in the mixture of Agrobacterium 
infiltration medium as described above for 30 sec, and placed on their sides to allow 
draining onto a 1' x 2' flat surface covered with plastic wrap. After 24 h, the plastic 
wrap was removed and pots were turned upright. The immersion procedure was 
repeated one week later, for a total of two immersions per pot. Seeds were then 
collected from each transformation pot and analyzed following the protocol described 
below. 

Example V: Identification of Arabidopsis Primary Transformants 

Seeds collected from the transformation pots were sterilized essentially as 
follows. Seeds were dispersed into a solution containing 0.1% (v/v) Triton X-100 
(Sigma) and sterile H2O and washed by shaking the suspension for 20 min. The wash 
solution was then drained and replaced with fresh wash solution to wash the seeds for 
20 min with shaking. After removal of the second wash solution, a solution 
containing 0.1% (v/v) Triton X-100 and 70% ethanol (Equistar) was added to the 
seeds and the suspension was shaken for 5 min. After removal of the 
ethanol/detergent solution, a solution containing 0.1% (v/v) Triton X-100 and 30% 
(v/v) bleach (Clorox) was added to the seeds, and the suspension was shaken for 10 
min. After removal of the bleach/detergent solution, seeds were then washed five 
times in sterile distilled H2O. The seeds were stored in the last wash water at 4° C for 
2 days in the dark before being plated onto antibiotic selection medium (1 X 
Murashige and Skoog salts (pH adjusted to 5.7 with 1M KOH), 1 X Gamborg's B-5 
vitamins, 0.9% phytagar (Life Technologies), and 50 mg/l kanamycin). Seeds were 
germinated under continuous illumination (50-75 µE/m²/sec) at 22-23° C. After 7-10 
days of growth under these conditions, kanamycin resistant primary transformants (T1 
generation) were visible and obtained. These seedlings were transferred first to fresh 
selection plates where the seedlings continued to grow for 3-5 more days, and then to 
soil (Pro-Mix BX potting medium). 

Primary transformants were crossed and progeny seeds (T2) collected; 
kanamycin resistant seedlings were selected and analyzed. The expression levels of 
the recombinant polynucleotides in the transformants varied from about a 5% 
expression level increase to at least a 100% expression level increase. Similar 
observations were made with respect to polypeptide level expression. 

Example VI: Identification of Arabidopsis Plants with Transcription Factor Gene 
Knockouts 

The screening of insertion mutagenized Arabidopsis collections for null 
mutants in a known target gene was essentially as described in Krysan et al. (1999) 
Plant Cell 11:2283-2290. Briefly, gene-specific primers, nested by 5-250 base pairs 
to each other, were designed from the 5' and 3' regions of a known target gene. 
Similarly, nested sets of primers were also created specific to each of the T-DNA or 
transposon ends (the "right" and "left" borders). All possible combinations of gene 
specific and T-DNA/transposon primers were used to detect by PCR an insertion 
event within or close to the target gene. The amplified DNA fragments were then 
sequenced, which allowed the precise determination of the T-DNA/transposon insertion 
point relative to the target gene. Insertion events within the coding or intervening 
sequence of the genes were deconvoluted from a pool comprising a plurality of 
insertion events to a single unique mutant plant for functional characterization. The 
method is described in more detail in Yu and Adam, US Application Serial No. 
09/177,733 filed October 23, 1998. 
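
The "all possible combinations" screening step above amounts to a Cartesian product of the gene-specific primers and the insertion-border primers. The sketch below enumerates those PCR reactions; the primer labels are hypothetical placeholders, and nesting (outer and inner rounds) is left out for brevity.

```python
from itertools import product


def primer_pairs(gene_primers, border_primers):
    """Return every (gene-specific, border-specific) primer pairing."""
    return list(product(gene_primers, border_primers))


# Hypothetical labels: one primer per end of the target gene and per border.
gene_primers = ["gene_5prime", "gene_3prime"]
border_primers = ["left_border", "right_border"]

for gene_primer, border_primer in primer_pairs(gene_primers, border_primers):
    print(f"PCR reaction: {gene_primer} + {border_primer}")
```

With two gene-specific primers and two border primers this gives four reactions per DNA pool, which is why pooling and later deconvolution keep the screen tractable.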

Example VII: Identification of Modified Phenotypes in Overexpression or Gene 
Knockout Plants 

Experiments were performed to identify those transformants or knockouts that 
exhibited modified biochemical characteristics. Among the biochemicals that were 
assayed were insoluble sugars, such as arabinose, fucose, galactose, mannose, 
rhamnose or xylose or the like; prenyl lipids, such as lutein, beta-carotene, 
xanthophyll-1, xanthophyll-2, chlorophylls A or B, or alpha-, delta- or gamma- 
tocopherol or the like; fatty acids, such as 16:0 (palmitic acid), 16:1 (palmitoleic 
acid), 18:0 (stearic acid), 18:1 (oleic acid), 18:2 (linoleic acid), 20:0, 18:3 (linolenic 
acid), 20:1 (eicosenoic acid), 20:2, 22:1 (erucic acid) or the like; waxes, such as by 
altering the levels of C29, C31, or C33 alkanes; sterols, such as brassicasterol, 
campesterol, stigmasterol, sitosterol or stigmastanol or the like; glucosinolates; 
and protein or oil levels. 

Fatty acids were measured using two methods depending on whether the tissue 
was from leaves or seeds. For leaves, lipids were extracted and esterified with hot 
methanolic H2SO4 and partitioned into hexane from methanolic brine. For seed fatty 
acids, seeds were pulverized and extracted in 
methanol:heptane:toluene:2,2-dimethoxypropane:H2SO4 (39:34:20:5:2) for 90 minutes 
at 80°C. After cooling to 
room temperature the upper phase, containing the seed fatty acid esters, was subjected 
to GC analysis. Fatty acid esters from both seed and leaf tissues were analyzed with a 
Supelco SP-2330 column. 

Glucosinolates were purified from seeds or leaves by first heating the tissue at 
95°C for 10 minutes. Preheated ethanol:water (50:50) was added and, after heating at 95°C for 
a further 10 minutes, the extraction solvent was applied to a DEAE Sephadex column 
which had been previously equilibrated with 0.5 M pyridine acetate. 
Desulfoglucosinolates were eluted with 300 µl water and analyzed by reverse phase 
HPLC monitoring at 226 nm. 

For wax alkanes, samples were extracted using the same method as for fatty 
acids, and extracts were analyzed on an HP 5890 GC coupled with a 5973 MSD. 
Samples were chromatographically isolated on a J&W DB35 mass spectrometer 
(J&W Scientific). 

To measure prenyl lipid levels, seeds or leaves were pulverized with 1 to 2% 
pyrogallol as an antioxidant. For seeds, extracted samples were filtered and a portion 
removed for tocopherol and carotenoid/chlorophyll analysis by HPLC. The 
remaining material was saponified for sterol determination. For leaves, an aliquot 
was removed and diluted with methanol and chlorophyll A, chlorophyll B, and total 
carotenoids measured by spectrophotometry by determining optical absorbance at 
665.2 nm, 652.5 nm, and 470 nm. An aliquot was removed for tocopherol and 
carotenoid/chlorophyll composition by HPLC using a Waters µBondapak C18 column 
(4.6 mm x 150 mm). The remaining methanolic solution was saponified with 10% 
KOH at 80°C for one hour. The samples were cooled and diluted with a mixture of 
methanol and water. A solution of 2% methylene chloride in hexane was mixed in 
and the samples were centrifuged. The aqueous methanol phase was 
re-extracted with 2% methylene chloride in hexane and, after centrifugation, the two upper 
phases were combined and evaporated. 2% methylene chloride in hexane was added 
to the tubes and the samples were then extracted with one ml of water. The upper 
phase was removed, dried, and resuspended in 400 µl of 2% methylene chloride in 
hexane and analyzed by gas chromatography using a 50 m DB-5ms (0.25 mm ID, 
0.25 µm phase, J&W Scientific). 
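
The leaf pigment measurement described earlier in this paragraph reads absorbance at three wavelengths in methanol, but the text does not state which equations convert those readings to concentrations. As an assumption for illustration only, the sketch below uses the widely cited Lichtenthaler (1987) coefficients for methanol extracts, which are defined for readings near 665.2, 652.4 and 470 nm; the absorbance values in the example are hypothetical.

```python
def pigments_in_methanol(a665_2, a652, a470):
    """Estimate pigment concentrations (ug per ml of extract) in methanol.

    Coefficients are an assumption of this sketch (Lichtenthaler 1987);
    they are not given in the source text.
    """
    chl_a = 16.72 * a665_2 - 9.16 * a652
    chl_b = 34.09 * a652 - 15.28 * a665_2
    carotenoids = (1000.0 * a470 - 1.63 * chl_a - 104.96 * chl_b) / 221.0
    return chl_a, chl_b, carotenoids


# Hypothetical absorbance readings from a diluted leaf extract.
chl_a, chl_b, carotenoids = pigments_in_methanol(0.5, 0.3, 0.4)
print(f"chlorophyll A: {chl_a:.3f}, chlorophyll B: {chl_b:.3f}, "
      f"total carotenoids: {carotenoids:.3f} ug/ml")
```

Any dilution applied before reading would need to be multiplied back in to report pigment per unit of tissue.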

Insoluble sugar levels were measured by the method essentially described by 
Reiter et al. (1997) Plant Journal 12:335-345. This method analyzes the neutral sugar 
composition of cell wall polymers found in Arabidopsis leaves. Soluble sugars were 
separated from sugar polymers by extracting leaves with hot 70% ethanol. The 
remaining residue containing the insoluble polysaccharides was then acid hydrolyzed 
with allose added as an internal standard. Sugar monomers generated by the 
hydrolysis were then reduced to the corresponding alditols by treatment with NaBH4, 
then acetylated to generate the volatile alditol acetates, which were then analyzed 
by GC-FID. Identity of the peaks was determined by comparing the retention times 
of known sugars converted to the corresponding alditol acetates with the retention 
times of peaks from wild-type plant extracts. Alditol acetates were analyzed on a 
Supelco SP-2330 capillary column (30 m x 250 µm x 0.2 µm) using a temperature 
program beginning at 180° C for 2 minutes followed by an increase to 220° C in 4 
minutes. After holding at 220° C for 10 minutes, the oven temperature was increased to 
240° C in 2 minutes, held at this temperature for 10 minutes, and brought back to 
room temperature. 
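
The oven temperature program above can be written down as a list of hold and ramp steps, from which the ramp rates and total run time follow. This is a minimal sketch; the step encoding and function name are choices made for this example, and the final return to room temperature is not counted.

```python
def summarize_program(start_c, steps):
    """Return (total_minutes, ramp_rates_c_per_min) for a GC oven program.

    steps is a list of ("hold", minutes) or ("ramp", target_c, minutes) tuples.
    """
    total = 0.0
    rates = []
    temp = start_c
    for step in steps:
        if step[0] == "hold":
            total += step[1]
        elif step[0] == "ramp":
            _, target_c, minutes = step
            rates.append((target_c - temp) / minutes)
            temp = target_c
            total += minutes
        else:
            raise ValueError(f"unknown step type: {step[0]!r}")
    return total, rates


# Program as described in the text, starting at 180 C.
ALDITOL_ACETATE_PROGRAM = [
    ("hold", 2), ("ramp", 220, 4), ("hold", 10), ("ramp", 240, 2), ("hold", 10),
]
total_min, ramp_rates = summarize_program(180, ALDITOL_ACETATE_PROGRAM)
print(total_min, ramp_rates)
```

Both ramps work out to 10° C per minute, and the programmed portion of the run lasts 28 minutes.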

To identify plants with alterations in total seed oil or protein content, 150 mg 
of seeds from T2 progeny plants were subjected to analysis by Near Infrared 
Reflectance Spectroscopy (NIRS) using a Foss NirSystems Model 6500 with a 
spinning cup transport system. NIRS is a non-destructive analytical method used to 
determine seed oil and protein composition. Infrared is the region of the 
electromagnetic spectrum located after the visible region in the direction of longer 
wavelengths. 'Near infrared' owes its name to being the infrared region nearest to the 
visible region of the electromagnetic spectrum. For practical purposes, near infrared 
comprises wavelengths between 800 and 2500 nm. NIRS is applied to organic 
compounds rich in O-H bonds (such as moisture, carbohydrates, and fats), C-H bonds 
(such as organic compounds and petroleum derivatives), and N-H bonds (such as 
proteins and amino acids). The NIRS analytical instruments operate by statistically 
correlating NIRS signals at several wavelengths with the characteristic or property 
intended to be measured. All biological substances contain thousands of C-H, O-H, 
and N-H bonds. Therefore, the exposure to near infrared radiation of a biological 
sample, such as a seed, results in a complex spectrum which contains qualitative and 
quantitative information about the physical and chemical composition of that sample. 

The numerical value of a specific analyte in the sample, such as protein 
content or oil content, is mediated by a calibration approach known as chemometrics. 
Chemometrics applies statistical methods such as multiple linear regression (MLR), 
partial least squares (PLS), and principal component analysis (PCA) to the spectral 
data and correlates them with a physical property or other factor; that property or 
factor is directly determined rather than the analyte concentration itself. The method 
first provides "wet chemistry" data of the samples required to develop the calibration. 

Calibration for Arabidopsis seed oil composition was performed using 
accelerated solvent extraction using 1 g seed sample size and was validated against 
certified canola seed. A similar wet chemistry approach was performed for seed 
protein composition calibration. 
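
The calibration idea described above, relating spectral readings to wet chemistry reference values, can be illustrated with the simplest possible case: a least-squares line fitted to a single wavelength. This is a deliberately reduced sketch, not the MLR or PLS calibration used on a real instrument, and all numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical calibration set: absorbance at one wavelength versus
# seed oil content (%) determined by solvent extraction.
absorbance = [0.10, 0.20, 0.30, 0.40]
oil_percent = [30.0, 32.0, 34.0, 36.0]

slope, intercept = fit_line(absorbance, oil_percent)
predicted = slope * 0.25 + intercept  # prediction for a new seed sample
print(f"predicted oil content: {predicted:.1f}%")
```

A multi-wavelength calibration extends the same idea by fitting one coefficient per wavelength, which is where MLR and PLS come in.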

Data obtained from NIRS analysis was analyzed statistically using a nearest- 
neighbor (N-N) analysis. The N-N analysis allows removal of within-block spatial 
variability in a fairly flexible fashion which does not require prior knowledge of the 
pattern of variability in the chamber. Ideally, all hybrids are grown under identical 
experimental conditions within a block (rep). In reality, even in many block designs, 
significant within-block variability exists. Nearest-neighbor procedures are based on 
the assumption that the environmental effect of a plot is closely related to that of its 
neighbors. Nearest-neighbor methods use information from adjacent plots to adjust 
for within-block heterogeneity and so provide more precise estimates of treatment 
means and differences. If there is within-plot heterogeneity on a spatial scale that is 
larger than a single plot and smaller than the entire block, then yields from adjacent 
plots will be positively correlated. Information from neighboring plots can be used to 
reduce or remove the unwanted effect of the spatial heterogeneity, and hence improve 
the estimate of the treatment effect. Data from neighboring plots can also be used to 
reduce the influence of competition between adjacent plots. The Papadakis N-N 
analysis can be used with designs to remove within-block variability that would not 
be removed with the standard split plot analysis (Papadakis, 1973, Inst. d'Amelior. 
Plantes Thessaloniki (Greece) Bull. Scientif, No. 23; Papadakis, 1984, Proc. Acad. 
Athens, 59, 326-342). 
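
A nearest-neighbor adjustment of the kind described above can be sketched for a single row of plots. The version below is a simplified, one-dimensional illustration and an assumption of this example rather than the analysis actually run: each plot's covariate is the mean residual of its immediate neighbors, and treatment means are adjusted by a within-treatment regression on that covariate.

```python
from collections import defaultdict


def papadakis_adjusted_means(treatments, values):
    """Treatment means adjusted for the mean residual of adjacent plots."""
    n = len(values)
    if n != len(treatments) or n < 2:
        raise ValueError("need at least two plots with matching labels")

    def group_means(labels, data):
        groups = defaultdict(list)
        for label, item in zip(labels, data):
            groups[label].append(item)
        return {label: sum(items) / len(items) for label, items in groups.items()}

    raw_means = group_means(treatments, values)
    residuals = [v - raw_means[t] for t, v in zip(treatments, values)]

    # Covariate: average residual of the left and right neighbors that exist.
    covariate = []
    for i in range(n):
        neighbors = [residuals[j] for j in (i - 1, i + 1) if 0 <= j < n]
        covariate.append(sum(neighbors) / len(neighbors))

    cov_means = group_means(treatments, covariate)
    sxy = sum((c - cov_means[t]) * r
              for t, c, r in zip(treatments, covariate, residuals))
    sxx = sum((c - cov_means[t]) ** 2 for t, c in zip(treatments, covariate))
    slope = sxy / sxx if sxx else 0.0
    grand = sum(covariate) / n
    return {t: raw_means[t] - slope * (cov_means[t] - grand) for t in raw_means}


# Hypothetical row of six plots with a fertility gradient along the row.
labels = ["A", "B", "A", "B", "A", "B"]
measurements = [10.0, 13.0, 12.0, 15.0, 14.0, 17.0]
print(papadakis_adjusted_means(labels, measurements))
```

When there is no spatial trend the covariate is zero everywhere and the adjusted means equal the raw treatment means.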

Experiments were performed to identify those transformants or knockouts that 
exhibited an improved pathogen tolerance. For such studies, the transformants were 
exposed to biotrophic fungal pathogens, such as Erysiphe orontii, and necrotrophic 
fungal pathogens, such as Fusarium oxysporum. Fusarium oxysporum isolates cause 
vascular wilts and damping off of various annual vegetables, perennials and weeds 
(Mauch-Mani and Slusarenko (1994) Molecular Plant-Microbe Interactions 7: 378- 
383). For Fusarium oxysporum experiments, plants grown on Petri dishes were 
sprayed with a fresh spore suspension of F. oxysporum. The spore suspension was 
prepared as follows: A plug of fungal hyphae from a plate culture was placed on a 
fresh potato dextrose agar plate and allowed to spread for one week. 5 ml sterile 
water was then added to the plate, swirled, and pipetted into 50 ml Armstrong 
Fusarium medium. Spores were grown overnight in Fusarium medium and then 
sprayed onto plants using a Preval paint sprayer. Plant tissue was harvested and 
frozen in liquid nitrogen 48 hours post infection. 

Erysiphe orontii is a causal agent of powdery mildew. For Erysiphe orontii 
experiments, plants were grown approximately 4 weeks in a greenhouse under 12 
hour light (20°C, ~30% relative humidity (rh)). Individual leaves were infected with 
E. orontii spores from infected plants using a camel's hair brush, and the plants were 
transferred to a Percival growth chamber (20°C, 80% rh). Plant tissue was harvested 
and frozen in liquid nitrogen 7 days post infection. 

Botrytis cinerea is a necrotrophic pathogen. Botrytis cinerea was grown on 
potato dextrose agar in the light. A spore culture was made by spreading 10 ml of 
sterile water on the fungus plate, swirling and transferring spores to 10 ml of sterile 
water. The spore inoculum (approx. 10^5 spores/ml) was used to spray 10 day-old 
seedlings grown under sterile conditions on MS (minus sucrose) media. Symptoms 
were evaluated every day up to approximately 1 week. 

Infection with bacterial pathogens Pseudomonas syringae pv maculicola (Psm) 
strain 4326 and pv maculicola strain 4326 was performed by hand inoculation at two 
doses. Two inoculation doses allow the differentiation between plants with enhanced 
susceptibility and plants with enhanced resistance to the pathogen. Plants were grown 
for 3 weeks in the greenhouse, then transferred to the growth chamber for the 
remainder of their growth. Psm ES4326 was hand inoculated with a 1 ml syringe on 3 
fully-expanded leaves per plant (4 1/2 wk old), using at least 9 plants per 
overexpressing line at two inoculation doses, OD=0.005 and OD=0.0005. Disease 
scoring occurred at day 3 post-inoculation with pictures of the plants and leaves taken 
in parallel. 

In some instances, expression patterns of the pathogen-induced genes (such as 
defense genes) were monitored by microarray experiments. cDNAs were generated by 
PCR and resuspended at a final concentration of ~100 ng/µl in 3X SSC or 150 mM 
Na-phosphate (Eisen and Brown (1999) Methods Enzymol. 303:179-205). The 
cDNAs were spotted on microscope glass slides coated with polylysine. The prepared 
cDNAs were aliquoted into 384 well plates and spotted on the slides using an x-y-z 
gantry (OmniGrid) purchased from GeneMachines (Menlo Park, CA) outfitted with 
quill type pins purchased from Telechem International (Sunnyvale, CA). After 
spotting, the arrays were cured for a minimum of one week at room temperature, 
rehydrated and blocked following the protocol recommended by Eisen and Brown 
(1999; supra). 

Total RNA samples (10 µg) were labeled using fluorescent Cy3 and 
Cy5 dyes. Labeled samples were resuspended in 4X SSC/0.03% SDS/4 µg salmon 
sperm DNA/2 µg tRNA/50 mM Na-pyrophosphate, heated at 95°C for 2.5 minutes, 
spun down and placed on the array. The array was then covered with a glass 
coverslip and placed in a sealed chamber. The chamber was then kept in a water bath 
at 62°C overnight. The arrays were washed as described in Eisen and Brown (1999) 
and scanned on a General Scanning 3000 laser scanner. The resulting files were 
subsequently quantified using Imagene, software purchased from BioDiscovery 
(Los Angeles, CA). 
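
Once spot intensities have been quantified for the two dye channels, induction or repression of a gene is commonly summarized as a log2 ratio. The sketch below shows that calculation under stated assumptions: which dye labels the treated sample, the background handling, and the intensity values are all hypothetical choices for this example and are not taken from the text.

```python
import math


def log2_ratio(cy5, cy3, background_cy5=0.0, background_cy3=0.0):
    """Log2 of the background-corrected Cy5/Cy3 intensity ratio.

    Returns None when either corrected signal is not positive, since the
    ratio is then undefined.
    """
    signal_cy5 = cy5 - background_cy5
    signal_cy3 = cy3 - background_cy3
    if signal_cy5 <= 0 or signal_cy3 <= 0:
        return None
    return math.log2(signal_cy5 / signal_cy3)


# Hypothetical spot: treated sample in Cy5, control sample in Cy3.
print(log2_ratio(800.0, 200.0))   # four-fold higher signal in Cy5
print(log2_ratio(150.0, 200.0, background_cy5=150.0))  # no usable Cy5 signal
```

A value of 0 means equal signal in both channels, positive values mean higher signal in the Cy5 channel, and negative values mean higher signal in the Cy3 channel.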

Experiments were performed to identify those transformants or knockouts that 
exhibited an improved environmental stress tolerance. For such studies, the 
transformants were exposed to a variety of environmental stresses. Plants were 
exposed to chilling stress (6 hour exposure to 4-8° C), heat stress (6 hour exposure to 
32-37° C), high salt stress (6 hour exposure to 200 mM NaCl), drought stress (168 
hours after removing water from trays), osmotic stress (6 hour exposure to 3 M 
mannitol), or nutrient limitation (nitrogen, phosphate, and potassium). For nitrogen 
limitation, all components of MS medium remained constant except that N was 
reduced to 20 mg/l of NH4NO3. For phosphate limitation, all components of MS 
medium remained constant except KH2PO4, which was replaced by K2SO4. For 
potassium limitation, all components of MS medium remained constant except 
KNO3 and KH2PO4, which were removed and replaced by NaH4PO4. 

Experiments were performed to identify those transformants or knockouts that 
exhibited modified structural and developmental characteristics. For such studies, the 
transformants were observed by eye to identify novel structural or developmental 
characteristics associated with the ectopic expression of the polynucleotides or 
polypeptides of the invention. 

Experiments were performed to identify those transformants or knockouts that 
exhibited modified sugar-sensing. For such studies, seeds from transformants were 
germinated on media containing 5% glucose or 9.4% sucrose which normally partially 
restrict hypocotyl elongation. Plants with altered sugar sensing may have either 
longer or shorter hypocotyls than normal plants when grown on this media. 
Additionally, other plant traits may be varied such as root mass. 

Flowering time was measured by the number of rosette leaves present when a 
visible inflorescence of approximately 3 cm was apparent. Rosette and total leaf number 
on the progeny stem are tightly correlated with the timing of flowering (Koornneef et 
al. (1991) Mol. Gen. Genet. 229:57-66). The vernalization response was measured. For 
vernalization treatments, seeds were sown to MS agar plates, sealed with micropore 
tape, and placed in a 4°C cold room with low light levels for 6-8 weeks. The plates 
were then transferred to the growth rooms alongside plates containing freshly sown 
non-vernalized controls. Rosette leaves were counted when a visible inflorescence of 
approximately 3 cm was apparent. 

Modified phenotypes observed for particular overexpressor or knockout plants 
are provided in Table 4. For a particular overexpressor that shows a less beneficial 
characteristic, it may be more useful to select a plant with a decreased expression of 
the particular transcription factor. For a particular knockout that shows a less 
beneficial characteristic, it may be more useful to select a plant with an increased 
expression of the particular transcription factor. 

The sequences of the Sequence Listing or those in Tables 4, 5 or those 
disclosed here can be used to prepare transgenic plants and plants with altered traits. 
The specific transgenic plants listed below are produced from the sequences of the 
Sequence Listing, as noted. Table 4 provides exemplary polynucleotide and 
polypeptide sequences of the invention. Table 4 includes, from left to right for each 
sequence: the first column shows the polynucleotide SEQ ID NO; the second column 
shows the Mendel Gene ID No., GID; the third column shows the trait(s) resulting 
from the knock out or overexpression of the polynucleotide in the transgenic plant; 
the fourth column shows the category of the trait; the fifth column shows the 
transcription factor family to which the polynucleotide belongs; the sixth column 
("Comment") includes specific effects and utilities conferred by the polynucleotide 
of the first column; the seventh column shows the SEQ ID NO of the polypeptide 
encoded by the polynucleotide; and the eighth column shows the amino acid residue 
positions of the conserved domain in amino acid (AA) co-ordinates. 
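
The eight-column layout of Table 4 described above maps naturally onto a small record type. The sketch below is only an illustration of that layout: the class and field names are invented for this example, and fields whose values are not given in this passage are left as placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Table4Row:
    """One row of Table 4, with fields in left-to-right column order."""
    polynucleotide_seq_id: int                 # column 1: polynucleotide SEQ ID NO
    gid: str                                   # column 2: Mendel Gene ID No. (GID)
    traits: Tuple[str, ...]                    # column 3: trait(s) observed
    category: str                              # column 4: category of the trait
    family: str                                # column 5: transcription factor family
    comment: str                               # column 6: effects and utilities
    polypeptide_seq_id: Optional[int] = None   # column 7: polypeptide SEQ ID NO
    conserved_domain: Optional[Tuple[int, int]] = None  # column 8: AA coordinates


# Values for G192 drawn from the description where stated; the category
# string is a placeholder, and columns 7 and 8 are left unset.
g192 = Table4Row(
    polynucleotide_seq_id=859,
    gid="G192",
    traits=("flowering time",),
    category="placeholder category",
    family="WRKY",
    comment="Late flowering under 12 hour light",
)
print(g192.gid, g192.family, g192.conserved_domain)
```

Keeping the optional columns as None rather than guessing makes it clear which entries still have to be filled in from the table itself.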

Seed of plants overexpressing sequences G265 (SEQ ID NOs:871 and 872), 
G715 (SEQ ID NOs:925 and 926), G1471 (SEQ ID NOs:311 and 312), G1793 (SEQ 
ID NOs:365 and 366), G1838 (SEQ ID NOs:381 and 382), G1902 (SEQ ID NOs:405 
and 406), G286 (SEQ ID NOs:877 and 878), G2138 (SEQ ID NOs:865 and 866) and 
G2830 (SEQ ID NOs:875 and 876) was subjected to NIR analysis and a significant 
increase in seed oil content compared with seed from control plants was identified. 

G192: G192 (SEQ ID NO: 859) was expressed in all plant tissues and under 
all conditions examined. Its expression was slightly induced upon infection by 
Fusarium. G192 was analyzed using transgenic plants in which this gene was 
expressed under the control of the 35S promoter. G192 overexpressors were late 
flowering under 12 hour light and had more leaves than control plants. This 
phenotype was manifested in the three T2 lines analyzed. Results of one experiment 
suggest that the G192 overexpressor was more susceptible to infection with a moderate 
dose of the fungal pathogen Erysiphe orontii. The decrease in seed oil observed for 
one line was replicated in an independent experiment. G192 overexpression delayed 
flowering. A wide variety of applications exist for systems that either lengthen or 
shorten the time to flowering, or for systems of inducible flowering time control. In 
particular, in species where the vegetative parts of the plants constitute the crop and 
the reproductive tissues are discarded, it will be advantageous to delay or prevent 
flowering. Extending vegetative development can bring about large increases in 
yields. G192 can be used to manipulate the defense response in order to generate 
pathogen-resistant plants. G192 can be used to manipulate seed oil content, which 
can be of nutritional value. 

Closely Related Genes from Other Species 

G192 had some similarity within the conserved WRKY domain to non- 
Arabidopsis plant proteins. 

G1946: G1946 (SEQ ID NO: 801) was studied using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. 
Overexpression of G1946 resulted in accelerated flowering, with 35S::G1946 
transformants producing flower buds up to a week earlier than wild-type controls (24- 
hour light conditions). These effects were seen in 12/20 primary transformants and in 
two independent plantings of each of the three T2 lines. Unlike many early flowering 
Arabidopsis transgenic lines, which are dwarfed, 35S::G1946 transformants often 
reached full-size at maturity, and produced large quantities of seeds, although the 
plants were slightly pale in coloration and had slightly flat leaves compared to wild- 
type. In addition, 35S::G1946 plants showed an altered response to phosphate 
deprivation. Seedlings of G1946 overexpressor plants showed more secondary root 
growth on phosphate-free media, when compared to wild-type control. In a repeat 
experiment, all three lines showed the phenotype. Overexpression of G1946 in 
Arabidopsis also resulted in an increase in seed glucosinolate M39501 in T2 lines 
1 and 3. An increase in seed oil and a decrease in seed protein were also observed in 
these two lines. G1946 was ubiquitously expressed, and did not appear to be 
significantly induced or repressed by any of the biotic and abiotic stress conditions 
tested at this time, with the exception of cold, which repressed G1946 expression. 
G1946 can be used to modify flowering time, as well as to improve the plant's 
performance in conditions of limited phosphate, and to alter seed oil, protein, and 
glucosinolate composition. 

Closely Related Genes from Other Species 

A comparison of the amino acid sequence of G1946 with sequences available 
from GenBank showed strong similarity with plant HSFs of several species 
(Lycopersicon peruvianum, Medicago truncatula, Lycopersicon esculentum, Glycine 
max, Solanum tuberosum, Oryza sativa and Hordeum vulgare subsp. vulgare). 

G375: The sequence of G375 (SEQ ID NO:239) was experimentally 
determined and G375 was analyzed using transgenic plants in which G375 was 
expressed under the control of the 35S promoter. Overexpression of G375 produced 
marked effects on leaf development. At early stages of growth, 35S::G375 seedlings 
developed narrow, upward pointing leaves with long petioles (possibly indicating a 
disruption in circadian-clock controlled processes or nyctinastic movements). 
Additionally, some seedlings were noted to have elongated hypocotyls, and some 
were rather small compared to wild-type controls. Comparable phenotypes were 
obtained by overexpression of an AP2 family gene, G2113 (SEQ ID NO: 85). 
Following the switch to flowering, 35S::G375 plants showed reduced fertility, which 
possibly arose from a failure of stamens to fully elongate. One of the three T2 lines 
(#41) was later flowering than wild-type controls, and also developed large numbers 
of small secondary rosette leaves in the axils of the primary rosette. Although these 
effects were not noted in the other two lines, the phenotypes obtained in line 41 were 
somewhat similar to those produced by overexpression of another Z-dof gene, G736 
(SEQ ID NO: 211). G375 was expressed in all tissues, although at different levels. It 
was expressed at low levels in the root and germinating seed, and expressed at high 
levels in the embryo. The effects of G375 on leaf architecture are of potential interest 
to the ornamental horticulture industry. 

Closely Related Genes from Other Species 

G375 showed some homology to non-Arabidopsis plant proteins within the 
conserved Dof domain. 

G1255: The sequence of G1255 (SEQ ID NO: 273) was experimentally 
determined and G1255 was analyzed using transgenic plants in which G1255 was 
expressed under the control of the 35S promoter. Plants overexpressing G1255 had 
alterations in leaf architecture, a reduction in apical dominance, an increase in seed 
size, and showed more disease symptoms following inoculation with a low dose of the 
fungal pathogen Botrytis cinerea. G1255 was constitutively expressed and not 
significantly induced by any conditions tested. On the basis of the phenotypes 
produced by overexpression of G1255, G1255 can be used to manipulate the plant's 
defense response to produce pathogen resistance, alter plant architecture, or alter seed 
size. 

Closely Related Genes from Other Species 

G1255 showed strong homology to a putative rice zinc finger protein 
represented by sequence AC087181_3. Sequence identity between these two proteins 
extended beyond the conserved domain, and therefore, these genes can be orthologs. 

G865: The complete cDNA sequence of G865 (SEQ ID NO: 557) was 
determined. G865 was ubiquitously expressed in Arabidopsis tissues. G865 was 
analyzed using transgenic plants in which G865 was expressed under the control of 
the 35S promoter. Plants overexpressing G865 were early flowering, with numerous 
secondary inflorescence meristems giving them a bushy appearance. G865 
overexpressors were more susceptible to infection with a moderate dose of the fungal 
pathogens Erysiphe orontii and Botrytis cinerea. In addition, seeds from G865 
overexpressing plants showed a trend of increased protein and reduced oil content, 
although the observed changes were not beyond the criteria used for judging
significance except in one line. G865 can be used to control flowering time. G865 
can be used to manipulate the defense response in order to generate pathogen-resistant 
plants. G865 can be used to alter seed oil and protein content of a plant. 

Closely Related Genes from Other Species 

G865 and other non-Arabidopsis AP2/EREBP proteins were similar within the 
conserved AP2 domain. 

G2509: G2509 (SEQ ID NO: 23) was studied using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. Overexpression 
of G2509 caused multiple alterations in plant growth and development, most notably, 
altered branching patterns, and a reduction in apical dominance, giving the plants a 
shorter, more bushy stature than wild type. Twenty 35S::G2509 primary
transformants were examined; at early stages of rosette development, these plants
displayed a wild-type phenotype. However, at the switch to flowering, almost all T1
lines showed a marked loss of apical dominance and large numbers of secondary
shoots developed from axils of primary rosette leaves. In the most extreme cases, the
shoots had very short internodes, giving the inflorescence a very bushy appearance.
Such shoots were often very thin and flowers were relatively small and poorly fertile. 
At later stages, many plants appeared very small and had a low seed yield compared 
to wild type. In addition to the effects on branching, a substantial number of 
35S::G2509 primary transformants also flowered early and had buds visible several 
days prior to wild type. Similar effects on inflorescence development were noted in 
each of three T2 populations examined. The branching and plant architecture 
phenotypes observed in 35S::G2509 lines resemble phenotypes observed for three 
other AP2/EREBP genes: G865 (SEQ ID NO: 557), G1411 (SEQ ID NO: 3), and
G1794 (SEQ ID NO: 13). G2509, G865, and G1411 form a small clade within the
large AP2/EREBP family, and G1794, although not belonging to the clade, is one of
the AP2/EREBP genes closest to it in the phylogenetic tree. It is thus likely that all
these genes share a related function, such as affecting hormone balance.
Overexpression of G2509 in Arabidopsis resulted in an increase in alpha-tocopherol
in seeds in T2 lines 5 and 11. G2509 was ubiquitously expressed in Arabidopsis plant
tissue. G2509 expression levels were altered by a variety of environmental or 
physiological conditions. G2509 can be used to manipulate plant architecture and 
development. G2509 can be used to alter tocopherol composition. Tocopherols have 
anti-oxidant and vitamin E activity. G2509 can be useful in altering flowering time. 
A wide variety of applications exist for systems that either lengthen or shorten the 
time to flowering. 

Closely Related Genes from Other Species 

G2509 showed some sequence similarity with known genes from other plant 
species within the conserved AP2/EREBP domain. 

G2347: G2347 (SEQ ID NO: 1119) was analyzed using transgenic plants in
which G2347 was expressed under the control of the 35S promoter. Overexpression 
of G2347 markedly reduced the time to flowering in Arabidopsis. This phenotype 
was apparent in the majority of primary transformants and in all plants from two out
of the three T2 lines examined. Under continuous light conditions, 35S::G2347 plants
formed flower buds up to a week earlier than wild type. Many of the plants were rather
small and spindly compared to controls. To demonstrate that overexpression of 
G2347 could induce flowering under less inductive photoperiods, two T2 lines were 
re-grown in 12 hour conditions; again, all plants from both lines bolted early, with 
some initiating flower buds up to two weeks sooner than wild-type. As determined by 
RT-PCR, G2347 was highly expressed in rosette leaves and flowers, and at much
lower levels in embryos and siliques. No expression of G2347 was detected in the
other tissues tested. G2347 expression was repressed by cold, by auxin
treatment, and by infection with Erysiphe. G2347 is also highly similar to the
Arabidopsis protein G2010 (SEQ ID NO: 1121). The level of homology between
these two proteins suggested they could have similar, overlapping, or redundant 
functions in Arabidopsis. In support of this hypothesis, overexpression of both G2010 
and G2347 resulted in early flowering phenotypes in transgenic plants. 

Closely Related Genes from Other Species 

The closest relative to G2347 is the Antirrhinum protein, SBP2 (CAA63061). 
The similarity between these two proteins is extensive enough to suggest they might 
have similar functions in a plant. 

G988: G988 (SEQ ID NO: 43) was analyzed using transgenic plants in which
G988 was expressed under the control of the 35S promoter. Plants overexpressing 
G988 had multiple morphological phenotypes. The transgenic plants were generally 
smaller than wild-type plants, had altered leaf, inflorescence and flower development, 
altered plant architecture, and altered vasculature. In one transgenic line 
overexpressing G988 (line 23), an increase in the seed glucosinolate M39489 was 
observed. The phenotype of plants overexpressing G988 was wild-type in all other 
assays performed. In wild-type plants, G988 was expressed primarily in flower and 
silique tissue, but was also present at detectable levels in all other tissues tested. 
Expression of G988 was induced in response to heat treatment, and repressed in 
response to infection with Erysiphe. Based on the observed morphological 
phenotypes of the transgenic plants, G988 can be used to create plants with larger 
flowers. This can have value in the ornamental horticulture industry. The reduction 
in the formation of lateral branches suggests that G988 can have utility in the forestry
industry. The Arabidopsis plants overexpressing G988 also had reduced fertility. 
This can be a desirable trait in some instances, as it can be exploited to prevent or 
minimize the escape of GMO (genetically modified organism) pollen into the 
environment. 

Closely Related Genes from Other Species 

The amino acid sequence for the Capsella rubella hypothetical protein 
represented by GenBank accession number CRU303349 was significantly identical to 
G988 outside of the SCR conserved domains. The Capsella rubella hypothetical 
protein is 90% identical to G988 over a stretch of roughly 450 amino acids. 
Therefore, it is likely that the Capsella rubella gene is an ortholog of G988. 

G2346: G2346 (SEQ ID NO: 459) was analyzed using transgenic plants in 
which the gene was expressed under the control of the 35S promoter. 35S::G2346 
seedlings from all three T2 populations had slightly larger cotyledons and appeared 
somewhat more advanced than controls. This indicated that the seedlings developed 
more rapidly than the control plants. At later stages, however, G2346 overexpressing
plants showed no consistent differences from control plants. The phenotype of these 
transgenic plants was wild-type in all other assays performed. According to RT-PCR 
analysis, G2346 is expressed ubiquitously. 

Closely Related Genes from Other Species 

G2346 shows some sequence similarity with known genes from other plant 
species within the conserved SBP domain. 

G1354: The complete sequence of G1354 (SEQ ID NO: 285) was determined. 
G1354 was analyzed using transgenic plants in which G1354 was expressed under the 
control of the 35S promoter. Overexpression of G1354 produced highly deleterious 
effects on growth and development. Only three 35S::G1354 T1 plants were obtained;
all were extremely tiny and slow developing. After three weeks of growth, each of 
the plants comprised a completely disorganized mass of leaves and root that had no 
clear axis of growth. Since these individuals would not have survived transplantation 
to soil, they were harvested for RT-PCR analysis; all three plants showed moderate 
levels of G1354 overexpression compared to whole wild-type seedlings of an
equivalent size. Only a very small number of transformants were obtained from two
selection attempts on separate batches of T0 seed. Usually between 15 and 120
transformants are obtained from each aliquot of 300 mg T0 seed from wild-type
plants. The low transformation frequency obtained in this experiment suggests that 
high levels of G1354 overexpression might have completely lethal effects and prevent 
transformed seeds from germinating. As determined by RT-PCR, G1354 was 
uniformly expressed in all tissues and under all conditions tested in RT-PCR. 
However, the gene was repressed in leaf tissue in response to Erysiphe infection. 

Closely Related Genes from Other Species 

G1354 is closely related to a NAM protein encoded by polynucleotide from 
rice (AC005310). Similarity between G1354 and this rice protein extends beyond the 
signature motif of the family to a level that would suggest the genes are orthologs. 

G1063: G1063 (SEQ ID NO: 119) is a member of a clade of highly related
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1499 (SEQ ID NO: 
7), G2143 (SEQ ID NO: 129), and G2557 (SEQ ID NO: 133). All of these genes 
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. A spectrum of developmental 
alterations was observed amongst 35S::G1063 primary transformants and the majority 
were markedly small, dark green, and had narrow curled leaves. The most severely 
affected individuals were completely sterile and formed highly abnormal 
inflorescences; shoots often terminated in pin-like structures, and flowers were 
replaced by filamentous carpelloid structures. In other cases, flowers showed 
internode elongation between floral whorls, with a central carpel protruding on a 
pedicel-like organ. Additionally, lateral branches sometimes failed to develop and 
tiny patches of carpelloid tissue formed at axillary nodes of the inflorescence. In lines 
with an intermediate phenotype, flowers contained defined whorls of organs, but 
sepals were converted to carpelloid structures or displayed patches of carpelloid 
tissue. In contrast, lines with a weak phenotype developed relatively normal flowers 
and produced a reasonable quantity of seed. Such plants were still distinctly smaller 
than wild-type controls. Since the strongest 35S::G1063 lines were sterile, three lines 
with a relatively weak phenotype, that had produced sufficient seed for biochemical
and physiological analysis, were selected for further study. Two of the T2
populations (T2-28,37) were clearly small, darker green and possessed narrow leaves
compared to wild type. Plants from one of these populations (T2-28) also produced
occasional branches with abnormal flowers like those seen in the T1. The final T2
population (T2-30) displayed a very mild phenotype. Overexpression of G1063 in 
Arabidopsis resulted in a decrease in seed oil content in T2 lines 28 and 37. No 
altered phenotypes were detected in any of the physiological assays, except that the 
plants were noted to be somewhat small and produce anthocyanin when grown in 
Petri plates. G1063 was expressed at low to moderate levels in roots, flowers, rosette 
leaves, embryos, and germinating seeds, but was not detected in shoots or siliques. It 
was induced by auxin. G1063 can be used to manipulate flower form and structure or 
plant fertility. One application for manipulation of flower structure can be in the 
production of saffron, which is derived from the stigmas of Crocus sativus. G1063 
has utility in manipulating seed oil and protein content. 

Closely Related Genes from Other Species 

G1063 protein shared extensive homology in the basic helix loop helix region 
with a protein sequence encoded by Glycine max cDNA clone (AW832545) as well 
as a tomato root (pre-anthesis plants) Lycopersicon esculentum cDNA (BE451174).

G2143: G2143 (SEQ ID NO: 129) is a member of a clade of highly related 
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1063 (SEQ ID NO:
119), G1499 (SEQ ID NO: 7), and G2557 (SEQ ID NO: 133). All of these genes 
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. Twelve out of twenty 35S::G2143
T1 lines showed a very severe phenotype; these plants were markedly small and had
narrow, curled, dark-green leaves. Such individuals were completely sterile and 
formed highly abnormal inflorescences; shoots often terminated in pin-like structures, 
and flowers were replaced by filamentous carpelloid structures, or a fused mass of 
carpelloid tissue. Furthermore, lateral branches usually failed to develop, and tiny 
patches of stigmatic tissue often formed at axillary nodes of the inflorescence. 
Strongly affected plants displayed the highest levels of transgene expression 
(determined by RT-PCR). The remaining T1 lines showed lower levels of G2143
overexpression; these plants were still distinctly smaller than wild type, but had 
relatively normal inflorescences and produced seed. Since the strongest 35S::G2143 
lines were sterile, three lines with a relatively weak phenotype, that had produced 
sufficient seed for biochemical analysis, were selected for further study. T2-11 plants
displayed a very mild phenotype and had somewhat small, narrow, dark green leaves. 
The other two T2 populations, however, appeared wild-type, suggesting that 
transgene activity might have been reduced between the generations. Reduced 
seedling vigor was noted in the physiological assays. G2143 expression was detected 
at low levels in flowers and siliques, and at higher levels in germinating seed. G2143 
can be used to manipulate flower form and structure or plant fertility. One application 
for manipulation of flower structure can be in the production of saffron, which is 
derived from the stigmas of Crocus sativus. 

Closely Related Genes from Other Species 

G2143 protein shared extensive homology in the basic helix loop helix region 
with a protein encoded by Glycine max cDNA clones (AW832545, BG726819 and 
BG154493) and a Lycopersicon esculentum cDNA clone (BE451174). There was
lower homology outside of the region. 

G2557: G2557 (SEQ ID NO: 133) is a member of a clade of highly related
HLH/MYC proteins that also includes G779 (SEQ ID NO: 113), G1063 (SEQ ID NO:
119), G1499 (SEQ ID NO: 7), and G2143 (SEQ ID NO: 129). All of these genes
caused similar pleiotropic phenotypic effects when overexpressed, the most striking 
of which was the production of ectopic carpelloid tissue. These genes can be 
considered key regulators of carpel development. The flowers of 35S::G2557 primary 
transformants displayed patches of stigmatic papillae on the sepals, and often had 
rather narrow petals and poorly developed stamens. Additionally, carpels were also 
occasionally held outside of the flower at the end of an elongated pedicel-like
structure. As a result of such defects, 35S::G2557 plants often showed very poor 
fertility and formed small wrinkled siliques. In addition to such floral abnormalities, 
the majority of primary transformants were also small and darker green in coloration 
than wild type. Approximately one third of the T1 plants were extremely tiny and
completely sterile. Three T1 lines (#7,9,12), that had produced some seeds, and
showed a relatively weak phenotype, were chosen for further study. All three of the
T2 populations from these lines contained plants that were distinctly small, had 
abnormal flowers, and were poorly fertile compared to controls. Stigmatic tissue was 
not noted on the sepals of plants from these three T2 lines. Another line (#4) that had 
shown a moderately strong phenotype in the T1 was sown for only morphological
analysis in the T2 generation. T2-4 plants were small, dark green, and produced 
abnormal flowers with ectopic stigmatic tissue on the sepals, as had been seen in the 
parental plant. G2557 expression was detected at low to moderate levels in all tissues 
tested except shoots. It was induced by cold, heat, and salt, and repressed by 
pathogen infection. G2557 can be used to manipulate flower form and structure or
plant fertility. One application for manipulation of flower structure can be in the 
production of saffron, which is derived from the stigmas of Crocus sativus. 

Closely Related Genes from Other Species 

G2557 protein shows extensive sequence similarity in the region of basic helix 
loop helix with a protein encoded by Glycine max cDNA clone (BE347811).

G2430: The complete sequence of G2430 (SEQ ID NO: 697) was 
determined. G2430 is a member of the response regulator class of GARP proteins 
(ARR genes), although one of the two conserved aspartate residues characteristic of 
response regulators is not present. The second aspartate, the putative phosphorylated 
site, is retained so G2430 can have response regulator function. G2430 is specifically 
expressed in embryo and silique tissue. In morphological analyses, plants 
overexpressing G2430 showed more rapid growth than control plants at early stages, 
and in two of three lines examined produced large, flat leaves. Early flowering was 
observed for some lines, but this effect was inconsistent between plantings. G2430 
can regulate plant growth. Overexpression of G2430 in Arabidopsis also resulted in 
seedlings that are slightly more tolerant to heat in a germination assay. Seedlings 
from G2430 overexpressing transgenic plants were slightly greener than the control 
seedlings under high temperature conditions. In a repeat experiment on individual 
lines, G2430 line 15 showed the strongest heat tolerant phenotype. G2430 can be 
useful to promote faster development and reproduction in plants. 

Closely Related Genes from Other Species

G2430 had some similarity within the conserved GARP and response-regulator
domains to non-Arabidopsis proteins.

G1478: The sequence of G1478 (SEQ ID NO: 831) was determined and 
G1478 was analyzed using transgenic plants in which G1478 was expressed under the 
control of the 35S promoter. Plants overexpressing G1478 had a general delay in 
progression through the life cycle, in particular a delay in flowering time. G1478 is 
expressed at higher levels in flowers, rosettes and embryos but otherwise expression 
is constitutive. Based on the phenotypes produced through G1478 overexpression, 
G1478 can be used to manipulate the rate at which plants grow, and flowering time. 

Closely Related Genes from Other Species 

G1478 shows some homology to non-Arabidopsis proteins within the 
conserved domain. 

G681: G681 (SEQ ID NO: 579) was analyzed using transgenic plants in
which the gene was expressed under the control of the 35S promoter. Approximately 
half of the 35S::G681 primary transformants were markedly small and formed narrow 
leaves compared to controls. These plants often produced thin inflorescence stems, 
had rather poorly formed flowers with low pollen production, and set few seeds. 
Three T1 lines with relatively weak phenotypes, which had produced reasonable
quantities of seed, were selected for further study. Plants from one of the T2 
populations were noted to be slightly small, but otherwise the T2 lines displayed no 
consistent differences in morphology from controls. In leaves of two of the T2 lines, 
overexpression of G681 resulted in an increase in the percentage of the glucosinolate 
M39480. According to RT-PCR analysis, G681 expression was detected at very low 
levels in flower and rosette leaf tissues. G681 was induced by drought stress. G681 
can be used to alter glucosinolate composition in plants. Increases or decreases in 
specific glucosinolates or total glucosinolate content are desirable depending upon the 
particular application. For example: (1) Glucosinolates are undesirable components 
of the oilseeds used in animal feed, since they produce toxic effects. Low- 
glucosinolate varieties of canola have been developed to combat this problem. (2) 
Some glucosinolates have anti-cancer activity; thus, increasing the levels or
composition of these compounds might be of interest from a nutraceutical standpoint.
(3) Glucosinolates form part of a plant's natural defense against insects. Modification
of glucosinolate composition or quantity could therefore afford increased protection 
from predators. Furthermore, in edible crops, tissue specific promoters can be used to 
ensure that these compounds accumulate specifically in tissues, such as the epidermis, 
which are not taken for consumption. 

Closely Related Genes from Other Species 

G681 shows some sequence similarity with known genes from other plant 
species within the conserved Myb domain. 

G878: G878 (SEQ ID NO: 611) was studied using transgenic plants in which
the gene was expressed under the control of the 35S promoter. Analysis of primary
transformants revealed that overexpression of G878 delays the onset of flowering in
Arabidopsis. Eleven of the twenty 35S::G878 T1 plants flowered approximately one week
later than wild type under continuous light conditions. These plants were also darker 
green, had shorter stems, and senesced later than controls. G878 was ubiquitously 
expressed. G878 can be used to modify flowering time and senescence, and a wide 
variety of applications exist for systems that either lengthen or shorten the time to 
flowering. 

Closely Related Genes from Other Species 

G878 was highly related to other WRKY proteins from a variety of plant 
species, such as the Nicotiana tabacum DNA-binding protein 2 (WRKY2) 
(AF096299), and a Cucumis sativus SPF1-like DNA-binding protein (L44134).

G374: G374 (SEQ ID NO: 47) was expressed at low levels throughout the 
plant and was induced by salicylic acid. G374 was investigated using lines carrying a 
T-DNA insertion in this gene. The T-DNA insertion was approximately three 
quarters of the way into the protein coding sequence and should result in a null 
mutation. Homozygosity for a T-DNA insertion within G374 caused lethality at early 
stages of embryo development. In an initial screen for G374 knockouts, heterozygous 
plants were identified. Seed from those individuals was sown to soil and eleven 
plants were PCR-screened to identify homozygotes. No homozygotes were obtained; 
6 of the progeny were heterozygous whilst the other 5 were wild type. This raised the
prospect that homozygosity for the G374 insertion was lethal. To examine this 
possibility further, heterozygous KO.G374 plants were re-grown. These individuals 
looked wild type, but their siliques were examined for seed abnormalities. When 
green siliques were dissected, around 25% of developing seeds were white or aborted. 
Embryos from these siliques were cleared using Hoyer's solution, and examined under
the microscope. It was apparent that embryos from the white seeds had arrested at
early (globular or heart) stages of development, whilst embryos from the normal seeds 
were fully developed. Such arrested or aborted seeds most likely represented 
homozygotes for the G374 insertion. To support this conclusion, seed was collected 
from heterozygous plants and sown to kanamycin plates (the T-DNA insertion carried 
the NPT marker gene). Of the seedlings that germinated, 160 were kanamycin
resistant and 107 were kanamycin sensitive. These data more closely fitted a 2:1
(chi-sq., 1 df, = 5.5, 0.05>P>0.01) than a 3:1 (chi-sq., 1 df, = 32, P<0.001) ratio. Such a
segregation ratio suggested that a homozygous class of kanamycin resistant seedlings
was absent from the progeny of KO.G374 plants. G374 can be a herbicide target.
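
The segregation test described above can be reproduced with a short calculation. The following is a minimal sketch, not the analysis actually performed: it computes the chi-square goodness-of-fit statistic for the observed 160 resistant and 107 sensitive seedlings against 2:1 and 3:1 expectations. The function names are hypothetical, and the P-value uses the closed form that applies only to one degree of freedom.

```python
import math


def chi_square_ratio(observed, ratio):
    """Chi-square goodness-of-fit statistic for observed counts vs. an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    # Expected counts if the classes segregated exactly in the given ratio.
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(chi2):
    """Upper-tail P-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Counts taken from the text: kanamycin resistant, kanamycin sensitive.
observed = (160, 107)

for ratio in ((2, 1), (3, 1)):
    chi2 = chi_square_ratio(observed, ratio)
    print(f"{ratio[0]}:{ratio[1]}  chi-sq = {chi2:.2f}  P = {p_value_1df(chi2):.3g}")
```

Under these assumptions the statistics come out near the reported values of 5.5 and 32, with the 2:1 hypothesis giving a P-value between 0.01 and 0.05 and the 3:1 hypothesis a P-value below 0.001.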

Closely Related Genes from Other Species 

Similar sequences to G374 are present in tomato and Medicago truncatula, and 
these sequences can be orthologs. 

Example VIII: Identification of Homologous Sequences

Homologous sequences from Arabidopsis and plant species other than 
Arabidopsis were identified using database sequence search tools, such as the Basic 
Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol 215:403- 
410; and Altschul et al. (1997) Nucl. Acid Res. 25: 3389-3402). The tblastx sequence 
analysis programs were employed using the BLOSUM-62 scoring matrix (Henikoff, 
S. and Henikoff, J. G. (1992) Proc. Natl. Acad. Sci. USA 89: 10915-10919). 

Identified non-Arabidopsis sequences homologous to the Arabidopsis
sequences are provided in Table 5. The percent sequence identity among these
sequences can be as low as 47%, or even lower. The entire NCBI
GenBank database was filtered for sequences from all plants except Arabidopsis 
thaliana by selecting all entries in the NCBI GenBank database associated with NCBI 
taxonomic ID 33090 (Viridiplantae; all plants) and excluding entries associated with
taxonomic ID 3701 (Arabidopsis thaliana). These sequences were compared to
sequences representing genes of SEQ ID NOs: 2 - 2N, where N = 2-561, using the
Washington University TBLASTX algorithm (version 2.0a19MP) at the default
settings using gapped alignments with the filter "off". For each gene of SEQ ID
NOs: 2 - 2N, where N = 2-561, individual comparisons were ordered by probability
score (P-value), where the score reflects the probability that a particular alignment
occurred by chance. For example, a score of 3.6e-40 is 3.6 x 10^-40. In addition to
P-values, comparisons were also scored by percentage identity. Percentage identity
reflects the degree to which two segments of DNA or protein are identical over a 
particular length. Examples of sequences so identified are presented in Table 5. 
Homologous or orthologous sequences are readily identified and available in 
GenBank by Accession number (Table 5; Test sequence ID). The identified 
homologous polynucleotide and polypeptide sequences and homologues of the 
Arabidopsis polynucleotides and polypeptides may be orthologs of the Arabidopsis 
polynucleotides and polypeptides (TBD: to be determined). 
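
The filtering and ranking steps described in this example can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline actually used: the `Hit` record, its field names, and the sample accessions, lineages, and scores are all hypothetical, and the hits are assumed to have been parsed already from TBLASTX output. Only the two taxonomic IDs (33090 and 3701) are taken from the text.

```python
from dataclasses import dataclass

VIRIDIPLANTAE_TAXID = 33090  # from the text: all plants
EXCLUDED_TAXID = 3701        # from the text: Arabidopsis entries to exclude


@dataclass(frozen=True)
class Hit:
    """One parsed alignment (hypothetical record layout)."""
    accession: str     # subject sequence identifier
    lineage: tuple     # taxonomic IDs on the path to the subject's taxon
    p_value: float     # probability that the alignment occurred by chance
    identities: int    # identical positions in the aligned segment
    align_length: int  # length of the aligned segment

    @property
    def percent_identity(self):
        return 100.0 * self.identities / self.align_length


def is_non_arabidopsis_plant(hit):
    """True for plant entries that are not associated with the excluded taxon."""
    return VIRIDIPLANTAE_TAXID in hit.lineage and EXCLUDED_TAXID not in hit.lineage


def rank_hits(hits):
    """Keep non-Arabidopsis plant hits, ordered by ascending P-value."""
    kept = [h for h in hits if is_non_arabidopsis_plant(h)]
    # Ties on P-value are broken by higher percentage identity.
    return sorted(kept, key=lambda h: (h.p_value, -h.percent_identity))


# Illustrative data only; none of these values come from Table 5.
hits = [
    Hit("HYPO_RICE_1", (33090, 4530), 3.6e-40, 94, 200),
    Hit("HYPO_ARAB_1", (33090, 3701), 1.0e-80, 200, 200),
    Hit("HYPO_SOY_1", (33090, 3847), 2.0e-55, 150, 200),
]

for h in rank_hits(hits):
    print(h.accession, h.p_value, f"{h.percent_identity:.0f}%")
```

Ordering on the P-value first mirrors the description above, in which comparisons are ordered by probability score and percentage identity is reported as a second measure.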

Example IX: Introduction of Polynucleotides into Dicotyledonous Plants

SEQ ID NOs: 1-(2N - 1), wherein N = 2-561, paralogous, orthologous, and
homologous sequences recombined into pMEN20 or pMEN65 expression vectors are 
transformed into a plant for the purpose of modifying plant traits. The cloning vector 
may be introduced into a variety of cereal plants by means well-known in the art such 
as, for example, direct DNA transfer or Agrobacterium tumefaciens-mediated
transformation. It is now routine to produce transgenic plants using most dicot plants
(see Weissbach and Weissbach (1989) supra; Gelvin et al. (1990) supra;
Herrera-Estrella et al. (1983) supra; Bevan (1984) supra; and Klee (1985) supra). Methods
for analysis of traits are routine in the art and examples are disclosed above. 

Example X: Transformation of Cereal Plants with an Expression Vector

Cereal plants such as corn, wheat, rice, sorghum or barley, may also be 
transformed with the present polynucleotide sequences in pMEN20 or pMEN65 
expression vectors for the purpose of modifying plant traits. For example, pMEN020 
may be modified to replace the NptII coding region with the BAR gene of
Streptomyces hygroscopicus that confers resistance to phosphinothricin. The KpnI
and BglII sites of the Bar gene are removed by site-directed mutagenesis with silent
codon changes. 

The cloning vector may be introduced into a variety of cereal plants by means 
well-known in the art such as, for example, direct DNA transfer or Agrobacterium 
tumefaciens-mediated transformation. It is now routine to produce transgenic plants 
of most cereal crops (Vasil, I., Plant Molec. Biol. 25: 925-937 (1994)) such as corn, 
wheat, rice, sorghum (Cassas, A. et al., Proc. Natl. Acad. Sci. USA 90: 11212-11216
(1993)) and barley (Wan, Y. and Lemeaux, P., Plant Physiol. 104:37-48 (1994)). DNA
transfer methods such as microprojectile bombardment can be used for corn (Fromm et al.,
Bio/Technology 8: 833-839 (1990); Gordon-Kamm et al., Plant Cell 2: 603-618
(1990); Ishida, Y., Nature Biotechnology 14:745-750 (1996)), wheat (Vasil et al.,
Bio/Technology 10:667-674 (1992); Vasil et al., Bio/Technology 11:1553-1558
(1993); Weeks et al., Plant Physiol. 102:1077-1084 (1993)), and rice (Christou,
Bio/Technology 9:957-962 (1991); Hiei et al., Plant J. 6:271-282 (1994); Aldemita
and Hodges, Planta 199:612-617 (1996); Hiei et al., Plant Mol Biol. 35:205-18 (1997)). For
most cereal plants, embryogenic cells derived from immature scutellum tissues are the
preferred cellular targets for transformation (Hiei et al., Plant Mol Biol. 35:205-18
(1997); Vasil, Plant Molec. Biol. 25: 925-937 (1994)). 

Vectors according to the present invention may be transformed into corn 
embryogenic cells derived from immature scutellar tissue by using microprojectile 
bombardment, with the A188XB73 genotype as the preferred genotype (Fromm, et 
al., Bio/Technology 8: 833-839 (1990); Gordon-Kamm et al., Plant Cell 2: 603-618 
(1990)). After microprojectile bombardment the tissues are selected on 
phosphinothricin to identify the transgenic embryogenic cells (Gordon-Kamm et al., 
Plant Cell 2: 603-618 (1990)). Transgenic plants are regenerated by standard corn 
regeneration techniques (Fromm, et al., Bio/Technology 8: 833-839 (1990); Gordon- 
Kamm et al., Plant Cell 2: 603-618 (1990)). 

The plasmids prepared as described above can also be used to produce 
transgenic wheat and rice plants (Christou, Bio/Technology 9:957-962 (1991); Hiei et 
al., Plant J. 6:271-282 (1994); Aldemita and Hodges, Planta 199:612-617 (1996); 
Hiei et al., Plant Mol Biol. 35:205-18 (1997)) that coordinately express genes of 
interest by following standard transformation protocols known to those skilled in the
art for rice and wheat (Vasil et al., Bio/Technology 10:667-674 (1992); Vasil et al.,
Bio/Technology 11:1553-1558 (1993); Weeks et al., Plant Physiol. 102:1077-1084
(1993)), where the bar gene is used as the selectable marker.

All references, publications, patent documents, web pages, and other 
documents cited or mentioned herein are hereby incorporated by reference in their 
entirety for all purposes. Although the invention has been described with reference to 
specific embodiments and examples, it should be understood that one of ordinary skill 
can make various modifications without departing from the spirit of the invention. 
The scope of the invention is not limited to the specific embodiments and examples 
provided. 

We claim:

1. A transgenic plant comprising a recombinant polynucleotide having a
nucleotide sequence selected from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected 
from those of SEQ ID NOs: 860, 802, 240, 274, 558, 24, 1120, 44, 460, 286, 120, 
130, 134, 698, 832, 580, 612, and 48, or a complementary nucleotide sequence 
thereof; 

(b) a nucleotide sequence of SEQ ID NOs: 859, 801, 239, 273, 557, 23, 1119, 43, 459,
285, 119, 129, 133, 697, 831, 579, 611, 47, or a complementary nucleotide sequence 
thereof; and 

(c) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide 
sequence of one or more polynucleotides of: (a) or (b). 

2. The transgenic plant of claim 1 wherein the transgenic plant possesses an 
altered trait as compared to another plant, or the transgenic plant exhibits an altered 
phenotype as compared to another plant, or the transgenic plant expresses an altered 
level of one or more genes associated with a plant trait as compared to another plant, 
wherein the other plant does not comprise the recombinant polynucleotide. 

3. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in one or more physical 
characteristics selected from the group consisting of: the number of trichomes, fruit 
and seed size and number, yield of stems, leaves, inflorescences, or roots, stability of 
seeds during storage, susceptibility of the seed to shattering, root hair length and 
quantity, internode distances, or the quality of seed coat.

4. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in a plant growth 
characteristic selected from the group consisting of: growth rate, germination rate of 
seeds, vigor of plants and seedlings, leaf and flower senescence, male sterility, 
apomixis, flowering time, flower abscission, rate of nitrogen uptake, osmotic 
sensitivity to soluble sugar concentrations, biomass or transpiration characteristics, 
apical dominance, branching patterns, number of organs, organ identity, and organ
shape or size. 



5. The transgenic plant of claim 1 wherein the plant possesses an altered trait as 
compared to another plant wherein the trait is an alteration in one or more 
characteristics selected from the group consisting of protein or oil production, seed 
protein or oil production, insoluble sugar level, soluble sugar level, and starch 
composition. 

6. The transgenic plant of claim 1, wherein the recombinant polynucleotide 
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:860. 

7. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:802.

8. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:240.

9. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:274.

10. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:558.

11. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:24.

12. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:1120.

13. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:44.

14. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:460.

15. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:286.

16. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:120.

17. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:130.

18. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:134.

19. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:698.

20. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:832.

21. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:580.

22. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:612.

23. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence encoding a polypeptide of SEQ ID NO:48.

24. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:859.

25. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:801.



26. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:239.

27. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:273.

28. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:557.

29. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:23.

30. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:1119.

31. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:43.

32. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:459.

33. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:285.

34. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:119.

35. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:129.




36. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:133.

37. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:697.

38. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:831.

39. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:579.

40. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:611.

41. The transgenic plant of claim 1, wherein the recombinant polynucleotide
comprises a nucleotide sequence of SEQ ID NO:47.

42. The transgenic plant of claim 1, further comprising a constitutive, inducible, 
or tissue-specific promoter operably linked to said nucleotide sequence. 

43. The transgenic plant of claim 1, wherein the plant is selected from the group
consisting of: soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, 
alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, raspberry, 
cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, 
lettuce, mango, melon, onion, papaya, peas, peppers, pineapple, pumpkin, spinach, 
squash, sweet corn, tobacco, tomato, watermelon, mint and other labiates, rosaceous 
fruits, and vegetable brassicas. 

44. The transgenic plant of claim 1 wherein the encoded polypeptide is expressed 
and regulates transcription of a gene. 

45. A method of using the transgenic plant of claim 1 to grow a progeny plant
from a parent plant, the method comprising crossing the transgenic plant with another 
plant, selecting seed, and growing the progeny plant from the seed. 

46. An isolated or recombinant polynucleotide comprising a nucleotide sequence 
selected from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected 
from SEQ ID NOs: 240, 274, 558, 286, 698, and 832, or a complementary nucleotide 
sequence thereof; 

(b) a nucleotide sequence of SEQ ID NOs:239, 273, 557, 285, 697, 831, or a 
complementary nucleotide sequence thereof; and 

(c) a nucleotide sequence that hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a) or (b). 

47. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide 
of SEQ ID NO:240.

48. The isolated or recombinant polynucleotide of claim 46, wherein the
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide
of SEQ ID NO:274.

49. The isolated or recombinant polynucleotide of claim 46, wherein the
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide
of SEQ ID NO:558.

50. The isolated or recombinant polynucleotide of claim 46, wherein the
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide
of SEQ ID NO:286.

51. The isolated or recombinant polynucleotide of claim 46, wherein the
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide
of SEQ ID NO:698.

52. The isolated or recombinant polynucleotide of claim 46, wherein the
recombinant polynucleotide comprises a nucleotide sequence encoding a polypeptide
of SEQ ID NO:832.

53. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:239. 

54. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:273. 

55. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:557. 

56. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:285. 

57. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:697. 

58. The isolated or recombinant polynucleotide of claim 46, wherein the 
recombinant polynucleotide comprises a nucleotide sequence of SEQ ID NO:831.

59. The isolated or recombinant polynucleotide of claim 46, further comprising a 
constitutive, inducible, or tissue-specific promoter operably linked to the nucleotide 
sequence. 

60. The isolated or recombinant polynucleotide of claim 46 wherein the encoded 
polypeptide is expressed and regulates transcription of a gene. 

61. A vector comprising the isolated or recombinant polynucleotide of claim 46.

62. A host cell comprising the vector of claim 61.

63. A method of using the isolated or recombinant polynucleotide of claim 46 for
producing a plant having a modified trait, the method comprising selecting a 
polynucleotide that encodes a polypeptide, inserting the polynucleotide into an 
expression vector, introducing the vector into a plant or a cell of a plant to 
overexpress the polypeptide, thereby producing a modified plant, and selecting a 
modified plant for a modified trait. 

64. The method of claim 63 wherein the plant possesses a modified trait as 
compared to another plant wherein the trait is an alteration in one or more physical 
characteristics selected from the group consisting of: the number of trichomes, fruit 
and seed size and number, yield of stems, leaves, inflorescences, or roots, stability of 
seeds during storage, susceptibility of the seed to shattering, root hair length and 
quantity, internode distances, or the quality of seed coat. 

65. The method of claim 63 wherein the plant possesses a modified trait as compared
to another plant wherein the trait is an alteration in a plant growth characteristic
selected from the group consisting of: growth rate, germination rate of seeds, vigor of 
plants and seedlings, leaf and flower senescence, male sterility, apomixis, flowering 
time, flower abscission, rate of nitrogen uptake, osmotic sensitivity to soluble sugar 
concentrations, biomass or transpiration characteristics, apical dominance, branching 
patterns, number of organs, organ identity, and organ shape or size. 

66. The method of claim 63 wherein the plant possesses a modified trait as 
compared to another plant wherein the trait is an alteration in one or more 
characteristics selected from the group consisting of protein or oil production, seed 
protein or oil production, insoluble sugar level, soluble sugar level, and starch 
composition. 

67. A modified plant produced by the method of claim 63. 

68. A method of using the plant of claim 67 to grow a progeny plant from a parent 
plant, the method comprising crossing the transgenic plant with another plant, 
selecting seed, and growing the progeny plant from the seed. 

69. The plant produced by the method of claim 68.

SEQUENCE LISTING

<110> Mendel Biotechnology, Inc. 
Ratcliffe, Oliver 
Riechmann, Jose Luis 
Adam, Luc J. 
Dubell, Arnold T.
Heard, Jacqueline E. 
Pilgrim, Marsha L. 
Jiang, Cai-Zhong 
Reuber, T. Lynne 
Creelman, Robert A. 
Pineda, Omaira 
Yu, Guo-Liang 
Broun, Pierre E. 

<120> YIELD-RELATED POLYNUCLEOTIDES AND
POLYPEPTIDES IN PLANTS 

<130> 514442002041 

<150> 60/310,847 
<151> 2001-08-09 

<150> 60/336,049 
<151> 2001-11-19 

<150> 60/338,692 
<151> 2001-12-11 

<150> 10/171,468 
<151> 2002-06-14 

>G1275 (58..579)

CCAAGAAAAGGGAAGATCACGCATTCTTATAGGCGTAATTCGTAAATAGTGGTGAGTATG 
AATGATGCAGACACAAACTTGGGGAGTAGTTTCAGCGATGATACTCACTCTGTGTTCGAG 
TTTCCGGAGCTAGACTTGTCAGATGAATGGATGGATGATGATCTTGTGTCTGCGGTTTCC 
GGGATGAATCAGTCTTATGGTTATCAGACTAGTGATGTTGCTGGTGCTTTATTCTCAGGT 
TCTTCTAGCTGTTTCAGTCATCCTGAATCTCCAAGTACCAAAACTTATGTTGCTGCTACA 
GCCACTGCTTCTGCCGACAACCAAAACAAGAAAGAAAAGAAAAAAATTAAAGGGAGAGTT 
GCGTTCAAGACACGGTCCGAGGTGGAAGTGCTTGACGACGGGTTCAAGTGGAGAAAGTAT 
GGGAAGAAGATGGTGAAGAACAGCCCACATCCAAGAAACTACTACAAATGTTCAGTTGAT 
GGCTGTCCCGTGAAGAAAAGGGTTGAACGAGACAGAGATGATCCGAGCTTTGTGATAACA 
ACTTACGAGGGTTCCCACAATCACTCAAGCATGAACTAAGACTCGAACTAAGGCTCAAGG 
CGACC^TGCTATATTC^GCA<^TCTTATTTTCTATGGTTACGAACGATACTTAAAACTGC 
TTCTAGTTCTTTATATCC^TTGTAAACTGGTTGCAGGTTCACAAATTTTGAGAGGTTTAT 
GACATTCTAAATCTG-TAGTACTTATATA 

>G1275 Amino Acid Sequence (domain in AA coordinates: 113-169) 

MNDADTNLGSSFSDDTHSVFEFPELDLSDEWMDDDLVSAVSGMNQSYGYQTSDVAGALFS

GSSSCFSHPESPSTKTYVAATATASADNQNKKEKKKIKGRVAFKTRSEVEVLDDGFKWRK

YGKKMVKNSPHPRNYYKCSVDGCPVKKRVERDRDDPSFVITTYEGSHNHSSMN*
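
Each nucleotide entry in this listing carries a coordinate pair in its header, for example (58..579) for G1275, which appears to mark the coding region in 1-based, inclusive positions; the amino acid entry that follows is its translation. The sketch below is illustrative only and is not part of the application: it assumes the standard genetic code, and the names `codon_count`, `translate_region`, and `G1275_PREFIX` are hypothetical. It shows how the coordinates relate to the length of the polypeptide and checks the first residues of G1275 against the first 75 nucleotides printed above.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_count(start, end):
    """Number of codons in a region given 1-based, inclusive coordinates."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("region length must be a positive multiple of 3")
    return length // 3


def translate_region(sequence, start, end):
    """Translate sequence[start..end] (1-based, inclusive).

    '*' marks a stop codon; 'X' marks any codon not in the table,
    for example one containing a damaged or ambiguous character.
    """
    if end > len(sequence):
        raise ValueError("region extends past the end of the sequence")
    codon_count(start, end)  # validates the coordinates
    region = sequence[start - 1:end].upper()
    return "".join(
        CODON_TABLE.get(region[i:i + 3], "X")
        for i in range(0, len(region), 3)
    )


# First 75 nucleotides of the G1275 entry as printed in this listing.
G1275_PREFIX = (
    "CCAAGAAAAGGGAAGATCACGCATTCTTATAGGCGTAATTCGTAAATAGTGGTGAGTATG"
    "AATGATGCAGACACA"
)

print(codon_count(58, 579))                    # 174 codons: 173 residues plus the stop codon
print(translate_region(G1275_PREFIX, 58, 75))  # MNDADT
```

The six residues returned for positions 58 to 75 agree with the start of the G1275 amino acid sequence, which is consistent with reading the header coordinates as the coding region.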

>G1411 (110..856)

TAAAGAAAAACTGAACAACCCTAAAGTACTGTATAAATCCTATATCAAATTTTTTTTTTG 
GAAGAAAAGGCTATATTTAAAAGAAAATCAAGCAAAAGTAGATCCTCGGATGTATGGGAA 
GAGGCCTTTTGGAGGTGATGAATCTGAAGAAAGGGAAGAAGATGAGAACTTGTTCCCGGT 
CTTCTCGGCCCGATCTCAACACGACATGCGTGTTATGGTCTCGGCCTTGACTCAAGTAAT 
CGGAAACCAACAAAGCAAATCTCATGATAACATCAGCTCTATTGATGATAACTATCCTTC
TGTGTATAATCCACAAGACCCTAATCAACAAGTTGCGCCTACTCATCAAGACCAAGGGGA 
CTTGAGGAGGAGACATTATAGAGGTGTAAGGCAAAGGCCATGGGGAAAGTGGGCAGCTGA 
AATCCGAGACCCAAAAAAGGCGGCACGTGTGTGGCTCGGGACATTTGAAACCGCTGAATC 
TGCGGCCTTAGCTTATGATGAAGCAGCCCTAAAGTTCAAAGGAAGCAAAGCAAAACTCAA 
TTTCCCGGAGAGGGTTCAGCTTGGAAGTAACTCT^ 

ACAAATGGAACCACAAAGTATACCGAACTATAATCAATACTATCATGATGCGAGTAGTGG 

TGATATGCTAAGTTTTAATTTGGGCGGTGGGTATGGGAGTGGTACCGGATATTCAATGTC 

TCATGATAATAGTACTACGACTCCTGCTACAACTTCTTCGTCTTCTGGTGGCTCTTCTAG 

GCAACAAGAAGAGCAAGATTATGCCAGATTOTGGCGCTTTGGGGAT^ 

TCATTCGGGATATTAATTAGGAGATTTGATCAGTTACTTGTGATGAAGTAATGATACATT 

TCCCGTCAAAATTGAGATGATCATATGCTTCCTGAATGTTTTTGAGTGTCATTTTTGTCT 

TCCGCGTTAAGATTTATTGAACGTGTTTT 

AAAAAAAAAAAAAA 

>G1411 Amino Acid Sequence (domain in AA coordinates: 87-154) 

MYGKRPFGGDESEEREEDENLFPVFSARSQHDMRVMVSALTQVIGNQQSKSHDNISSIDD 

NYPSVYOTQDPNQQVAPTHQDQGDLRRRHYRG^ 

TAESAALAYDEAALKFKGSKAKLNFPERVQLGSNSTYYSSNQIPQMEPQSIPNYNQYYHD
ASSGDMLSFNLGGGYGSGTGYSMSHDNSTTTAATTSSSSGGSSRQQEEQDYARFWRFGDS
SSSPHSGY* 
>G1488 (1..996) 

ATGGAAGATGAAGCACATGAATTCTTCCACACATCTGATTTTGCCGTTGATGACCTTTTA 
GTTGATTTCTCTAACGATGATGACGAAGAAAACGATGTTGTTGCTGATTCCACCACTACC 
AC(^CCATAACCGA(^GCTCTAACTTCTCCGCTGCTGATCTTCCCAGTTTCCACGGTGAT 
GTTCAAGACGGCACTAGCTTCTCCGGTGACCTTTGTATACCTTCTGATGATTTGGCTGAT 
GAGTTAGAGTGGCTTTCGAACATTGTGGATGAATCATTGTCGCCTGAAGATGTACACAAG 
CTCGAGCTAATATCCGGTTTTAAGAGTCGACCGGACCCGAAATCCGATACCGGAAGCCCG 
GAAAACCCGAATAGCAGCAGTCCGATTTTTACTACCGACGTTTCTGTACCGGCCAAAGCT 
AGAAGCA^CGCTCACGCGCCGCTGCGTGTAATTGGGCCTCACGTGGGCTTCTCAAGGAA 
ACGTTTTACGAGAGTCCTTTCACCGGAGAAAC^ 

CCGCCAACCTCGCCGCCTTTGTTGATGGCTCCGCTAGGGAAAAAGCAAGCCGTTGATGGA 
GGACACCGACGGAAGAAGGATGTTTCTTCACCGGAGTCTGGTGGCGCAGAGGAGAGACGG 
TGTCTCCACTGCGCCACGGATAAGACTCCGCAATGGCGGACAGGCCCAATGGGCCCGAAG 
ACGTTGTGCAACGCTTGCGGTGTTAGGTACAAATCGGGACGTTTAGTGCCGGAGTATCGG 
CCCGCGGCGAGTCCGACGTTTGTGCTGGCGAAACACTCAAATTCTCATCGGAAAGTTATG 
GAGCTCCGGCGACAGAAGGAGATGAGTAGGGCCCATCATGAGTTCATACATCACCATCAC 
GGTACGGACACTGCCATGATTTTCGACGTTTCATCGGACGGTGA1X3ATTACTTGATCCAC 
CACAACGTTGGCCCAGATTTCAGACAGCTTATTTGA

>G1488 Amino Acid Sequence (domain in AA coordinates: 221-246) 
MEDEAHEFFHTSDFAVDDLLVDFSNDDDEENDVVADSTTTTTITDSSNFSAADLPSFHGD
VQDGTSFSGDLCIPSDDLADELEWLSNIVDESLSPEDVHKLELISGFKSRPDPKSDTGSP
ENPNSSSPIFTTDVSVPAKARSKRSRAAACNWASRGLLKETFYDSPFTGETILSSQQHLS
PPTSPPLLMAPLGKKQAVDGGHRRKKDVSSPESGGAEERRCLHCATDKTPQWRTGPMGPK
TLCNACGVRYKSGRLVPEYRPAASPTFVLAKHSNSHRKVMELRRQKEMSRAHHEFIHHHH
GTDTAMIFDVSSDGDDYLIHHNVGPDFRQLI*
>G1499 (159..833)

CCTTTAATATATATAtTATGCTCACACACACACATATATATATACATATAA 

AAGCATTAAAATTTTTACGAACCAAACAAACAAAAATTATGAATAATTATAATATGAACC 

CATCTCTCTTCCAAAATTACACTTGGAACAACATCAT 

AGAATGATGATCATCATCATCAACATAATAATGATCCAATCGGTATGGCCATGGACCAGT 
ACACACAGCTCCATATCTTCAATCCTTTCTCTTCTTCTCATTTCCCTCCTCTCTCTTCTT 
CCCTCACAACCACCACTCTTCTCTCCGGAGATCAAGAAGACGACGAAGACGAAGAAGAAC 
CTCTAGAGGAACTCGGTGCTATGAAGGAAATGATGTACAAGATCGCAGCCATGCAATCGG 
TTGACATCGACCCAGCAACCGTCAAGAAACCCIAAACGCCGTAACGTGAGGATCTCCGACG 
ACCCTCAGAGTGTGGCGGCTAGACATCGCCGTGAGAGAATCAGTGAGAGGATCAGAATTC 
TTCAGAGACTCGTGCCAGGTGGCACTAAAATGGATACGGCTTCAATGCTCGATGAAGCTA 
TACGCTATGTCAAGTTCTTGAAACGGCAGATCCGGCTACTCAATAATAATACCGGATATA
CTCCTCCGCCGCCGCAAGATCAAGCITCTCAGGCGGTGACGACGTCATGGGTTTCACCGC 
CACCACCGCCAAGTTTCGGCCGTGGGGGAAGAGGAGTAGGAGAATTAATCTAGACAAGAT 
GACATTTCCATTAGTAGTAACTAAATTATGCTATAATGTGTGAGTAATGGTGCAATTATG 
GA 

>G1499 Amino Acid Sequence (domain in AA coordinates: 118-181) 
MNNYNT^PSLFQNYTWl^IINSSNNNNK^ 

HFPPLSSSLTTTTLLSGDQEDDEDEEEPLEELGAMKEMMYKIAAMQSVDIDPATVKKPKR 
RNVRISDDPQSVAARHRRERISERIRILQRLVPGGTKITOTAS^^ 
LNNNTG YTP PPPQDQASQAVTTSWVS PPP PPS FGRGGRGVGELI * 
>G1543 (1..828) 

ATGATAAAACTACTATTTACGTAOVTATGCACATACACATATAAACTATATGCTCTATAT 
CATATGGATTACGCATGCGTGTGTATGTATAAATATAAAGGCATCGTCACGCTTCAAGTT 
TGTCTCTTTTATATTAAACTGAGAGTTTTC 

CTAGCTCTTAAGAACCCTAATAATTC^TTGATC7U\AATAATGGCGATTTTGCCGGAAAAC 
TCTTCAAACTTGGATCTTACTATCTCCGTTCCAGGCCT^ 

GAAGGAAGTGGCGGAGGAAGAGACCAGCTAAGGCTAGACATGAATCGGTTACCGTCGTCT 

GAAGACGGAGACGATGAAGAATTCAGTCACGATGATGGCTCTGCTCCTCCGCGAAAGAAA 

CTCCGTCTAACCAGAGAACAGT(^CGTCTTCTTGAAGATAGTTTCAGACAGAATCATACC 

CTTAATCCCAAACAAAAGGAAGTACTTGCCAAGCATTTGATGCTACGGCCAAGACAAATT 

GAAGTTTGGTTTCAAAACCGTAGAGCAAGGAGCAAATTGAAGCAAACCGAGATGGAATGC 

GAGTATCTCAA7VAGGTGGTTTGGTTCATTAACGGAAGAAAACCACAGGCTCCATAGAGAA 

GTAGAAGAGCTTAGAGCCATAAAGGTTGGCCCAACAACGGTGAACTCTGCCTCGAGCCTT 

ACTATGTGTCCTCGCTGCGAGCGAGTTACCCCTGCCGCGAGCCCTTCGAGGGCGGTGGTG 

CCGGTTCCGGCTAAGAAAACGTTTCCGCCGCAAGAGCGTGATCGTTGA 

>G1543 Amino Acid Sequence (domain in AA coordinates: 135-195) 

M I KLLFTYI CTYTYKLYALYHMDYACVCMYKYKG I VTLQVCLF YI KLRVFLSNFTFSS S I 

IJ^KNPNNSLIOMAILPENSS^DIiTISVPGFSSSPLSDEGSGGGRDQLRLDMNRLPSS 

EDGDDEE FSHDDGSAPPRKKLRLTREQSRLLEDS FRQNHTLNPKQKEVLAKHLMLRPRQI 

EWFQNRRARSKLKQTEI^CEYLKRWFGSLTEENHRLHREVEELRAIKVGPTTVNS 

TMCPRCERVTPAAS PSRAWPVPAKKTFPPQERDR* 

>G1635 (1..1164) 

ATGGCGTCGTCTCCGTTGACTGCAAATGTTCAGGGTACCAACGCTTCTTTGAGGAATAGA 
GATGAAGAAACTGCAGACAAGCAGATACAATTCAATGACCAAAGTTTTGGGGGAAATGAC 
TATGCACCCAAGGTACGGAAGCCATACACGATAACAAAAGAGAGAGAGAGATGGACAGAT 
GAAGAGC^CAAGAAGTTTGTTGAAGCCTTGAAATTATACGGGCGAGCTTGGAGACGAATA 
GAAGAACATGTGGGCTCAAAGACCGCAGTTCAGATTCGAAGCCATGCTCAGAAGTTTTTC 
TCTAAGGTTGCTCGAGAAGCAACTGGAGGTGATGGGAGCTCAGTAGAGCCGATTGTAATA 
CCTCCTCCTCGTCCCAAGAGAAAGCCAGCGCATCCGTACCCTCGTAAGTTTGGGAACGAG 
GCAGATCAAACAAGTAGATCGGTTTCTCCCTCAGAACGTGATACTCAATCTCCAACCTCT 
GTGTTGTCCACTGTTGGATC^GAAGCATTGTGTTCCCTTGATTCGAGTTCACCCAATCGA 
AGCTTGTCCCCAGTTTCTTCTGCATCACCACCAGCTGCTCTTACAACCACTGCAAATGCA 
CCTGAAGAGCTTGAGACTCTGAAGCTGGAGTTGTTTCCTAGTGAGAGACTCTTAAACAGG 
GAGAGCTCGATCAAGGAACCAACGAAGCAAAGTCTTAMCTCTTTGGGAAGACAGTTTTG 
GTATCTGATT(^GGCATGTCCTCTTCTCTAACAACTTCAACATATTGTAAATCCCCAATT 
CAGCCATTACCACGGAAACTCTCATCATCCAAGACACTACCCATAATAAGAAACTCACAA
GAAGAACTCTTGAGCTGCTGGATACAAGTCCCTCTTAAGCAAGAAGATGTGGAAAATAGA 
TGTTTGGATTCAGGAAAGGCTGTCCAAAACGAAGGATCATCGACTGGATCAAACACTGGT 
TCGGTGGATGATACGGGACACACGGAAAAGACCACAGAACCCGAAACAATGCTATGTCAA 
TGGGAGTTTAAACCAAGTGAGAGGTCTGCATTTTCTGAGCTCAGAAGAACAAACTCCGAG 
TCAAATTCAAGAGGATTTGGTCCATACAAGAAGAGAAAGATGGTAACAGAAGAAGAAGAG 
CATGAGATTCATCTCCACTTATAA 

>G1635 Amino Acid Sequence (domain in AA coordinates: 44-104) 
MASSPLTANVQGTNASLRNRDEETADKQIQFNDQSFGGNDYAPKVRKPYTITKERERWTD
EEHKKFVEALKLYGRAWRRIEEHVGSKTAVQIRSHAQKFFSKVAREATGGDGSSVEPIVI 
PPPRPKRKPAHPYPRKFGNEADQTSRSVSPSERDTQSPTSVLSTVGSEALCSLDSSSPNR 
SLSPVSSASPPAALTTTANAPEELETLKLELFPSERLLNRESSIKEPTKQSLKLFGKTVL 
VSDSGMSSSLTTSTYCKSPIQPLPRKLSSSKTLPIIRNSQEELLSCWIQVPLKQEDVENR
CLDSGKAVQNEGSSTGSNTGSVDDTGHTEKTTEPETMLCQWEFKPSERSAFSELRRTNSE
SNSRGFGPYKKRKMVTEEEEHEIHLHL*
>G1794 (160..1335)

TCTTTCTTTCTTCCTCTTTGTCTCTGTTTCTTGTTTCTCTCTCTCTCTCTCTA 

TTCTTTCCCTCGAAGAAAAAGAATATTTTT7KAATTTAATTTTCTCTGCGTTTATAAGCTT 

TAAGTTTCAGAGGAGGAGGATTTAGAAGGAGGGTTTTGTATGTGTGTCTTAAAAGTGGCA 

AATCAGGAAGATAACGTTGGCAAAAAAGCCGAGTCTATTAGAGACGATGATCATCGGACG 

TTATCTGAAATCGATCAATGGCrTTTACTTATTCG CAGCCGAAG ACX3AC CAC CACCGTCAT 

AGCTTCCCTACGCAGCAGCCGCCTCCATCGTCGTCGTCCTCATCTCTTATCTCAGGTTTC 

AGTAGAGAGATGGAGATGTCTGCTATTGTCTCTGCTTTGACTCACGTTGTTGCTGGAAAT 

GTTCCTCAGCATCAACAAGGAGGCX^TGAAGGTAGCGGAGAAGGGACTTCGAATTCGTCT 

TCTTCCTCGGGGCAGAAAAGGAGGAGAGAGGTGGAGGAAGGTGGCGCCAAAGCGGTTAAG 

GC^GCTAATACTTTGACGGTTGATCAATATTTCTCCGGTGGTAGCTCTACTTCTA 

AGAGAAGCTTCGAGTAACATGTCAGGTCCGGGCCCAACATACGAGTATACAACTACGGCA 

ACTGCTAGTAGCGAAACGTCGTCGTTTAGTGGGGACCAACCTCGGCGAAGATACAGAGGA 

GTTAGACAAAGACCATGGGGAAAGTGGGCGGCTGAGATTCGAGATCCATTTAAAGCAGCT 

AGAGTTTGGCTCGGTACGTTCGACAATGCTGAATCAGCAGCAAGAGCTTACGACGAAGCT 

GCACTTCGGTTTAGAGGCAACAAAGCCAAACTCAACTTCCCTGAAAACGTCAAACTCGTT 

AGACCTGCTTCAACCGAAGCACAACCTGTGCACCAAACCGCTGCTCAAAGACCGACCCAG 

TCAAGGAACTCGGGTTCAACGACTACCCTTTTGCCCATAAGACCTGCTTCGAATCAAAGC 

GTTCATTCGCAGCCGTTGATGCAATCATAC^CTTGAGTTACTCTGAAATGGCTCGTCAA 

CAACAACAGTTTCAGCAACATCATCAACAATCTTTGGATTTATACGATCAAATGTCGTTT 

CCGTTGCGTTTCGGTCACACTGGAGGTTCAATGATGCAATCTACGTCGTCATCATCATCT 

CATTCTCGTCCTCTGTTTTCCCCGGCTGCTGTTCAGCCGCCACCAGAATCAGCTAGCGAA 

ACCGGTTATCTCCAGGATATACAATGGCCATCAGACAAGACTAGTAATAACTACAATAAT 

AGTC(^TCCTCCTGATGACTTGCTTCATTTTATTTGTTTCACTATAGAGTAATAGAAAAC 

AGGAAAATGATTATATGTTATAGAGTTATTTTTCCAAATATTATAGGGTTTAGGTTGTT^ 

GTATTGTTCTGCTTTCATCCTCTCATGCTTTTTTTCTTAATTTATTATATTTTTGCATTA 

TAATTTCGTTTCATTGTAACAAACATTAAAAAGACCACATGGAGAAAGGAAAAAAAAGAG 

AG 

>G1794 Amino Acid Sequence (domain in AA coordinates: TBD) 
MCVLKVANQEDNVGKKAES IRDDDHRTLS E IDQWL YLFAAEDDHHRHS FPTQQPPPS S S S 
SSLISGFSREMEMSAIVSALTHWAGNVPQHQQGGGEGSGEGTSNSSSSSGQKRRREVEE 
GGAKAVKAANTLTVDQYFSGGSSTSKVREASSNMSGPGPTYEYTTTATASSETSSFSGDQ 
PRRRYRGVRQRPWGKWAAE IRDP FKAARWLGTFDNAESAARAYDEAALRFRGNKAKLNF 
PENVKLWPASTEAQPVHQTAAQRPTQSRNSGSTTTLLPIRPASNQSVHSQPLMQSYNLS 
YSEMARQQQQFQQHHQQSLDLYDQMSFPLRFGHTGGSMMQSTSSSSSHSRPLFSPAAVQP 
PPESASETGYLQDIQWPSDKTSNNYNNSPSS * 
>G1839 (38..592)

ATCACAGTTATGTTTCCATTCATTGGCTATAAAAACCATGCTCACTCCCTTTTGTTCTTC 
ACACCATTTGCAGGAAAAAATGAATAGTTGTGAG 

AGAAAATGTTCTATTTAATGATCAAAACGAAAATTTCACACTTGTTGCAC 

TTCTTCGTACTTGACAAGAGATCAAGAGCACGAGATCATGGTCTCTGCTCTGCGACAAGT 

GATATCTAACTCCGGAGCTGACGACGCGTCATCATCAAACTTGATCATCACAAGCGTTCC 

GCCTCCAGACGCTGGCCCTTGTCCTCTCTGTGGCGTCGCCGGTTGCTACGGCTGCACATT 

ACAACGGCCGCACCGAGAGGTAAAGAAGGAGAAGAAATACAAAGGAGTAAGGAAAAAACC 

ATCGGGTAAATGGGCGGCGGAGATATGGGATCCGAGATCGAAATCAAGGAGGTGGCTTGG 

AACGTTTCTTACGGCGGAGATGGCGGCACAATCTTACAATGATGCGGCGGCTGAGTATCG 

AGCAAGACGTGGTAAAACAAACGGAGAAGGAATTAAACGGCGGTGGAGATGACTGAGAAG 

GACATGGTCGGTGATCATACACGGCGAGGTGGAAATGTTATATTTACTATTGAAAACTAA 

ATTATTTATTATAGAGGGAGATATTACTCTTTACGCTTTCATTAAGATTTATTTTTATAA 

GTTTTAAAGTATTTTATTGTTATAAAAAAAAAAAAAAAAAAAAAA 

>G1839 Amino Acid Sequence (domain in AA coordinates: TBD) 

MLTPFCSSHHLQEKMNSCQSNPTKMDNSENVLFNDQNENFTLVAPHPSSSYLTRDQEHEI 

MVSALRQVISNSGADDAS SSNLI ITSVPPPDAGPCPLCGVAGCYGCTLQRPHREVKKEKK 

YKGVRKKPSGKWAAE I WDPRS KSRRWLGTFLTAEMAAQS YNDAAAEYRARRGKTNGEG I K 
RRWR*

>G2108 (35..694)

GAGAGAGAAACATTGATCTCTGAATATTGTGAACATGTTGAAATCAAGTAACAAGAGAAA 
AAGCAAAGAAGAGAAGAAGTTACAAGAAGGGAAGTACCTTGGAGTGAGGAGACGTCCATG 
GGGAAGATATGCAGCTGAAATCAGAAACCCTTTTACTAAAGAAAGACATTGGCTTGGAAC 
GTTTGATACAGCCGAAGAAGCTGCTTTTGCATATGACGTTGCTGCTCGATCCATCAGCGG 
CTCTCTAGCTACAACAAACTTCTTCTACACTGAAAACACCTCTTTAGAAAGACATCCACA 
ACAGTCTTTGGAGCCTCATATGACTTGGGGATCTTCTAGTCTCTGTCITCTTCAAGAT^ 
GCCTTTTGAAAACAACCATTTTGTTGCTGATCCTATCTCTTCTTCTTTTTCTCAAAAACA 
AGAGTCTTCTACCAATCTCACTAACACTTTCT 

TGGCCAAAGCAAAGAGATTTCTTTACCTAATGATATGTCAAACAGTTTATTCGGTCATC 

GGACAAAGTCGGTGAACATGACAATGCAGACCATATGAAGTTTGGCTCAGTTCTCAGCGA 

CGAACCTCTCTGCTTTGAGTATGACTACATTGGGAATTATCTTCAGAGTTTTCTCAAAGA 

TGTCAACGACGATGCTCCACAGTTTCTTATGT^ 

TG 

>G2108 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MLKSSNK3^SKEEKKLQEGKYLGVRRR^ 

DVAARSISGSLATTNFFYTENTSLERHPQQSLEPHM^ 

ISSSFSQKQESSTNIjTNTFSHCYNDGDHVGQSKEISLPNDMSNSLFGHQDKVGEHDNADH 

MKFGSVLSDEPLCFEYDYIGNYLQSFLKDVNDDAPQFLM* 

>G2291 (27..797)

GCTTTCTCACCTTTATAAAATAGAAAATGGAAAACAGCTACACCGTTGATGGTCACCGTC 

TTC^TATTCCGTTCCGTTAAGCTCCATGC^TGAAACCAGTCAAAACTCCGAAACTTACG 

GATTATCCAAAGAGTCGCCGTTGGTCTGCATGCCCTTGTTCGAAACCAACACTACT 

TCGATATCTCTTCTCTTTTCTCGTTTAACCC^AAACCAGAACCCGAAAACACGCATCGTG 

TCATGGACGATTCCATCGCCGCCGTCGTGGGCGAAAACGTTCTTTTCGGTGATAAAAACA 

AAGTCTCTGATCACTTGACCAAAGAAGGTGGTGTGAAGCGGGGGCGGAAGATGCCGCAGA 

AGACCGGAGGATTCATGGGAGTGAGAAAACGGCCGTGGGGGAGATGGTCGGCGGAGATAA 

GAGACAGGATAGGGCGGTGCAGACACTGGTTAGGAACGTTCGACACGGCGGAAGAGGCAG 

CGCGTGCGTATGACGCGGCGGCGAGGAGGCTTAGAGGGACCAAAGCCAAGACCAATTTCG 

TGATTCCTCCGCTTTTTCCCAAGGAT^ATAGCTCAGGCTCAGGAGGATAATAGGATGAGGC 

AGAAGCAGAAGAAGAAGAAGAAGAAAAAAGTGAGTGTGAGGAAGTGTGTTAAAGTCACAT 

CGGTTGCACAGTTGTTCGATGATGCCAATTTTATAAATTCTTCTAGTATTAAAGGAAATG 

TGATTAGTTCTATTGATAATCTTGAAAAAATGGGTCTAGAGCTTGATTTGAGTTTAGGGT 

TGTTGTCTAGGAAGTGATAAAGCACTCGTAGTTAAGTAGTTGTAGTT 

>G2291 Amino Acid Sequence (domain in AA coordinates: TBD) 

MENS YTVDGHRLQ YS VPLS SMHETSQNSET YGLS KES PLVCMPLFETNTTS FD I S SLFS F 

NPKPEPENTHRVMDDSIAAWGENVLFGDOTKVSDHLTKEGGVKRGRKMPQKTGGFM 

KRP WGRWS AE I RDRI GRCRHWLGTFDTAEE AARAYDAAARRLRGTKAKTNFVI PPLFPKE 

I AQAQEDNRMRQKQKKKKKKKVS VRKCVKVTSVAQLFDDANFINS SS I KGNVI S S IDNLE 

KMGLELDLSLGLLSRK* 

>G2452 (1..804) 

ATGTCATCGTCGACGATGTACAGAGGAGTTAATATGTTTTCACCGGCAAACACAAACTGG 

ATTTTTCAAGAAGTCAGAGAAGCCACGTGGACGGCGGAGGAGAAC^AACGGTTCGAGAAA 

GCTCTCGCTTATCTGGACGACAAAGACAATCTTGAGAGCTGGTCCAAGATCGCAGATTTG 

ATTCCCGGCAAAACAGTAGCTGACGTCATTAAACGATACAAGGAGCTAGAGGATGATGTC 

AGCGACATCGAAGCCGGACTTATCCCCATTCCGGGATACGGCGGCGACGCCTCCTCCGCT 

GCAAACAGTGACTA5TTCTTTGGTCTAGAAAACTCCAGCTACGGTTATGATTACGTCGTT 

GGAGGAAAGAGGAGTTCGCCGGCGATGACItSATTGTTTTAGGTCTCCGATGCCGGAAAAG 

GAGAGGAAGAAAGGAGTTCCGTGGACCGAGGACGAACACCTACGATTTCTGATGGGTTTG 

AAGAAATATGGAAAAGGAGATTGGAGAAACATAGCAAAAAGCTTTGTGACGACTCGAACG 

CCGACGCAAGTCGCTTCAC^CGCrC^GAAATATTTTCTTCGACAACTCA 

GACAAAAGACGATCAAGTATTCACGATATCACCACTGTTAACATCCCTGACGCAGACGCA 

TCCGCAACCGCCACGACCGCTGACGTAGCACTCTCTCCTACTCCAGCCAATTCTTTTGAC 

GTTTTCCTTCAGCCAAATCCTCATTACAGTTTCGCGTCTGCGTCTGCGTCTAGCTATTAT 

AATGCGTTTCCGCAGTGGAGTTAA 

>G2452 Amino Acid Sequence (conserved domain in AA coordinates: 27-213)
MSSSTMYRGVNMFSPANTNWIFQEVREATWTAE

IPGKTVADVT KRYKELEDDVSDIEAGLI PI PGYGGDASSAANSDYFFGLENS S YGYDYW 
GGKRSSPAMTDCFRSPMPEKERKKGVPWTEDEHLRFLMGLKKYGKGDWRNIAKSFVTTRT 
PTQVASHAQKYFLRQLTDGKDKRRSS IHDITTVNI PDADASATATTADVALS PTPANS FD 
VFLQPNPHYS FASASASS YYNAFPQWS * 
>G2509 (143..934)

ATATATTCCCTCTTTCATTCTCCTTCTTCGTCTTTTCTTTGTTTCTCATATTCA^ 

CCTCAATTCCAAATCTTAAACCCTAAATTTACAGACACAATCGAGATCACCTGAAAAAAG 

AGGTTTAAAGATTTTAG CAAAGATGG CGAATTCAGG AAATTATGG AAAGAGGCCCTTTCG 

AGGCGATGAATCGGATG AAAAGAAAGAAGCCGATGATGATGAG AACATATTC C CTTTCTT 

CTCTGCCCGATCCCAATATGACATGCGTGCCATGGTCTCAGCCTTGACTCAAGTCATTGG 

AAACCAAAGCAGCTCTCATGATAATAACCAACATCAACCTGTTGTGTATAATCAACAAGA 

TCCTAACCCACCGGCTCCTCCAACTCAAGATCAAGGGCTATTGAGGAAGAGGCACTATAG 

AGGGGTAAGACAACGACCATGGGGAAAGTGGGCAGCTGAAATTCGGGATCCGCAAAAGGC 

AGCACGGGTGTGGCTCGGGACATTTGAGACTGCTGAAGCTGCGGCTTTAGCTTATGATAA 

CGCAGCTCTTAAGTTCAAAGGAAGCAAAGCCAAACTCAATTTCCCTGAGAGAGCTCAACT 

AGCAAGTAACACTAGTACAACTACCGGTCCACCAAACTATTATTCTTCTAATAATCAAAT 

TTACTACTCAAATCCGCAGACTAATCCGCAAACCATACCTTATTTTAACCAATACTACTA 

TAACCAATATCTTCATCAAGGGGGGAATAGTAACGATGCATTAAGTTATAGCTTGGCCGG 

TGGAGAAACCGGAGGCTCAATGTATAATCATCAGACGTTATCTACTACAAATTCTTCATC 

TTCTGGTGGATCTTCAAGGCAACAAGATGATGAACAAGATTACGCCAGATATTTGCGTTT 

TGGGGATTCTTCACCTCCTAATTCrGGTTTTTGAGATCTTCAATAAACTGATAATAAAGG 

ATTTGGGTCACTTGTTATGAGGGGATCATATGTTTTCTAA 

>G2509 Amino Acid Sequence (domain in AA coordinates: 89-156)

MANSGNYGKRPFRGDESDEKKEADDDENIFPFFSARSQYDMRAMVSALTQVIGNQSSSHD 

NNQHQPWYNQQDPNP PAP PTQDQGLLRKRHYRGVRQRPWGKWAAE IRDPQKAARVWLGT 

FETAEAAALAYDNAALKFKGSKAKIjNFPERAQIASNTSTTTGPPNYYSSNN 

NPQTIPYFNQYYYNQYLHQGGNSNDALSYSLAGGETGGSMYNHQTIiSTTNSSSSGGSSRQ 

QDDEQDYARYLRFGDSSPPNSGF* 

>G390 (1..2526) 

ATGATGGCTCATCACTCCATGGACGATAGAGACTCTCCTGATAAAGGATTTGATTCCGGC 
AAGTACGTTAGATACACGCCGGAACAAGTTGAAGCTCTTGAGAGAGTTTATGCTGAGTGT 
CCTAAACCTAGCTCTCTGAGAAGACAACAGCTTATTCGTGAATGTCCCATTCTCTGTAAC 
ATCGAGCCTCGACAGATCTU^GTTTGGTTCCAGAATCGCAGATGTCGAGAGAAGCAGAGG 
AAAGAGTCAGCTCGTCTTCAGACAGTGAACAGGAAGCTGAGTGCTATGAACAAGCTTTTG 
ATGGAAGAGAATGATCGTTTGCAGAAGCAAGTCTCCAACTTGGTTTATGAGAATGGATTC 
ATGAAACATCGAATCC^CACTGCTTCTGGGACGACC^CAGACAAC^GCTGTGAGTCTGTG 
GTCGTGAGTGGTCAGCAACGTCAGCAGCAAAACCCAACACATCAGCATCCTCAGCGTGAT 
GTTAACAACCCAGCTAATCTTCTCTCGATTGCGGAGGAGACCTTGGCGGAGTTCCTTTGC 
AAGGCTACAGGAACTGCTGTCGACTGGGTCCAGATGATTGGGATGAAGCCTGGTCCGGAX 
TCTATTGGTATCGTAGCTGTTTCACGCAACTGCAGTGGAATAGCAGCACGTGCCTGTGGC 
CTCGTGAGTTTAGAACCCATGAAGGTCGCTGAAATCCTCAAAGATCGTCCATCTTGGTTC 
CGTGACTGTCGATGTGTCGAGACTCTGAATGTTATACCCACTGGAAATGGTGGTACTATC 
GAGCTTGTCAACACTCAGATTTATGCTCCTACAACATTAGCAGCAGCTCGTGACTTTTGG 
ACGCTGAGATATAGTACAAGTCTAGAAGATGGAAGCTATGTGGTCTGTGAGAGATCACTC 
ACTTCTGCAACTGGTGGCCCCAATGGTCCACTTTCTTCAAGCTTCGTGAGAGCCAAAATG 
CTGTCAAGCGGGTTTCTTATCCGTCCTTGTGATGGTGGTGGTTCCATTATTCACATCGTT 
GATCATGTGGACTTGGATGTCTCAAGTGTTCCTGAAGTCCTCAGGCCTCTTTATGAGTCT 
TCCAAAATCCTTGCTCAAAAAATGACTGTCGCTGCTCTGAGACATGTGCGCCAAATTGCT 
CAAGAGACTAGTGGAGAAGTCCAGTATAGTGGTGGACGCCAGCCTGCAGTTTTAAGGACT 
TTCAGCCAGAGACTCTGCCGGGGTTTCAATGATGCTGTAAATGGTTTTGTCGATGATGGA 
TGGTCTCCAATGAGTAGTGATGGAGGAGAGGATATTACGATCATGATTAACTCTTCCTCT 
GCTAAATTTGCTGGCTCCCAATACGGTAGCTCATTTCTTCCAAGTTTTGGAAGTGGTGTC 
CTCTGTGCCAAAGCTTCTATGCTGTTGCAGAATGTTCCACCCCTTGTATTGATTCGGTTC 
CTGAGAGAACACCGAGCTGAATGGGCAGACTATGGTGTCGATGCCTATTCTGCTGCATCT 
CTCAGAGCAACTCCATATGCTGTTCCATGCGTCAGAACCGGTGGGTTCCCGAGTAACCAA 
GTCATTCTTCCTCTCGCACAGACACTCGAACATGAAGAGTTTCTCGAAGTGGTTAGACTT
GGAGGTCATGCTTACTCACCTGAAGACATGGGCTTATCCCGGGATATGTATTTACTGCAG
CTTTGTAGCGGCGTTGATGAAAATGTGGTTGGAGGTTGTGCTCAGCTTGTCTTTGCCCCA 
ATCGATGAATCATTTGCTGATGATGCACCTTTGCT 

CTCGACCAAAAAACAAATCCGAATGATCATO^TCTGCAAGTCGAACACGGGATCTAGCA 
TCGTCCCTAGATGGTTCCACCAAAACCGATTCGGAAACAAACTCTAGATTGGTCTTAACA 
ATAGCCTTCCAGTTCACGTTTGATAACCATTCCAGAGACAATGTTGCTACAATGGCGAGA 
CAGTATGTGAGGAACGTTGTTGGTTCGATTCAGAGAGTGGCTCTAGCCATTACGCCTCGT 
CCIX^CTCAATGCAACTTCCCACTTCCCCTGAAGCTCTCACTCTTGTCCGTTGGATCACC 
TOTAGTTACAGTATTCATACAGGTGCAGATCTGTTTGGAGCTGATTCTCAGTCCTGTGGA 
GGAGACACATTGCTTAAGCAACTCTGGGACCATAGTGATGCCATATTGTGCTGCTCCCTG 
AAAACTAATGCCTCACCGGTATTCACATTTGCAAACCAAGCTGGTTTAGACATGCTTGAA 
ACTACACTTGTGGCACTTCAGGATATAATGCTCGACAAAACACTTGATGACTCTGGTCGT 
AGAGCTCTTTGCTCCGAGTTCGCCAAGATCATGCAGCAGGGATATGCGAATCTTCCGGCA 
GGAATATGTGTGTCGAGCATGGGCAGACCGGTTTCGTATGAGCAAGCGACGGTGTGGAAA 
GTTGTTGATGACAACGAATCAAACCACTGCTTGGCTTTTACCCrrCGTTAGTTGGTCGTTT 
GTTTGA 

>G390 Amino Acid Sequence (domain in AA coordinates: 18-81) 
MMAHHSMDDRDSPDKGFDSGKYVRYTPEQVEALERVYAECPKPSSLRRQQLIRECPILCN
IEPRQIKVWFQNRRCREKQRKESARLQTVNRKLSAMNKLLMEENDRLQKQVSNLVYENGF
MKHRIHTASGTTTDNSCESVVVSGQQRQQQNPTHQHPQRDVNNPANLLSIAEETLAEFLC
KATGTAVDWVQMIGMKPGPDSIGIVAVSRNCSGIAARACGLVSLEPMKVAEILKDRPSWF
RDCRCVETLNVIPTGNGGTIELVNTQIYAPTTLAAARDFWTLRYSTSLEDGSYVVCERSL
TSATGGPNGPLSSSFVRAKMLSSGFLIRPCDGGGSIIHIVDHVDLDVSSVPEVLRPLYES
SKILAQKMTVAALRHVRQIAQETSGEVQYSGGRQPAVLRTFSQRLCRGFNDAVNGFVDDG
WSPMSSDGGEDITIMINSSSAKFAGSQYGSSFLPSFGSGVLCAKASMLLQNVPPLVLIRF
LREHRAEWADYGVDAYSAASLRATPYAVPCVRTGGFPSNQVILPLAQTLEHEEFLEVVRL
GGHAYSPEDMGLSRDMYLLQLCSGVDENVVGGCAQLVFAPIDESFADDAPLLPSGFRVIP
LDQKTNPNDHQSASRTRDLASSLDGSTKTDSETNSRLVLTIAFQFTFDNHSRDNVATMAR
QYVRNVVGSIQRVALAITPRPGSMQLPTSPEALTLVRWITRSYSIHTGADLFGADSQSCG
GDTLLKQLWDHSDAILCCSLKTNASPVFTFANQAGLDMLETTLVALQDIMLDKTLDDSGR
RALCSEFAKIMQQGYANLPAGICVSSMGRPVSYEQATVWKVVDDNESI^

V* 

>G391 (1..2559) 

ATGATGATGGTCCATTCGATGAGCAGAGATATGATGAACAGAGAGTCGCCGGATAAAGGG 
TTAGATTCCGGCAAGTATGTGAGGTACACGCCGGAGCAAGTGGAAGCTCTCGAGAGAGTT 
TACACTGAGTGTCCTAAGCGAAGTTCTCTAAG^^GACAACAACTCATACGTGAATGTCCG 
ATTCTCTCTAACATCGAGCCTAAGCAGATCAAAGTTTGGTTTCAGAACCGCAGATGTCGT 
GAGAAGCAGAGGAAAGAAGCTGCTCGTCTTCAAACAGTGAACAGAAAACTCAATGCCATG 
AACAAACTCTTGATGGAAGAGAATGATCGTTTGCAGAAGCAAGTTTCTAACTTGGTCTAT 
GAGAATGGC(^(^TGAAACATC^CTTCACACTGCTTCTGGGACGACCACAGACAACAGC 
TGTGAGTCTGTGGTCGTGAGTGGTCAGCAA^TO^CAGCAAAACCCAAATCCTCAGCAT 
CAGCAACGTGATGCTAACAACCCAGCAGGACTCCTTTCTATAGCAGAGGAGGCCCTAGCA 
GAGTTCCTTTCCAAGGCTACAGGAACTGCTGTTGACTGGGT^ 

CCTGGTCCGGATTCTATTGGCATAGTCGCTATTTCGCGCAACTGCAGCGQAATTGCAGCA 

CGTGCCTGCGGCCTCGTGAGTTTAGAACCCATGAAGGTTGCTGAAATTCTCAAAGATCGT 

CCATCTTGGCTCCGAGATTGTCGAAGTGTGGATACTCTGAGTGTGATACCTGCTGGAAAC 

GGTGGGACGATCGAGCTTATTTACACGCAGATGTATGCTCCTACGACTTTAGCAGCAGCT 

CGTGACTTTTGGACGCTGAGATATAGCACATGTTTGGAAGATGGAAGCTATGTGGTTTGT 

GAAAGGTCGCTTACTTCTGCAACTGGTGGCCCCACTGGGCCACCTTCTTCAAACTTTGTG 

AGAGCTGAAATGAAAGCAAGCGGGTTTCTCATCCGTCCTTGCGATGGTGGTGGTTCCATT 

CTCCACATTGTTGATCATGTTGATCTGGATGCCTGGAGTGTCCCTGAAGTCATGAGGCCT 

CTCTATGAATCATCGAAGATTCTTGCTCAGAAAATGACTGTTGCTGCTTTGAGACATGTA 

AGACAAATTGCACAAGAAACAAGTGGAGAAGTTCAGTATGGTGGAGGGCGCCAACCTGCG 

GTTTTAAGAACCTTCAGTCAAAGACTCTGTCGGGGTTTCAATGATGCTGTTAA 

GTGGATGATGGATGGTCACCAATGGGTAGCGATGGTGCAGAGGATGTTACTGTAATGATA 

AACTTGTCCCCTGGGAAGTTTGGTGGGTCTCAGTACGGTAATTCATTCCTTCCAAGCTTT 

GGTAGTGGCGTGCTTTGTGCCAAGGCATCTATGTTGCTTCAGAACGTTCCACCCGCTGTG
CTGGTTCGATTCCTTAGAGAACACCGATCTGAATGGGCTGATTATGGCGTGGATGCTTAT

GCTGCTGCATCGCTCAGAGCAAGTCCTTTTGCn^TTCCTTGTGCTAGAGCTGGGGGGTTC 

CCAAGTAACCAAGTCATTCTTCCTCTTGCGCAGACAGTTGAACATGAAGAGTCACTTGAG 

GTGGTTAGACTTGAAGGTCACGCTTACTCACCCGAAGACATGGGTTTAGCTCGGGATATG 

TATTTGCTACAGCTTTGTAGCGGTGTTGATGAAAATGTGGTTGGAGGTTGTGCACAGCTT 

GTATTTGCCCCTATCGATGAATC7VTTTGCTGATGATGCACCTTTGCTTCCTTC 

CGCATCATACCTCTTGAACAGAAATCTACTCCGAACGGTGCATCTGCAAACCGTACCCTG 

GATTTAGCCTCAGCTTTAGAAGGATCCACACGTCAAGCTGGTGAAGCCGACCCAAATGGC 

TGTAACTTTAGGTCGGTACTAACCATAGCATTCCAGTTCACATTTC 

GACAGTGTTGCn?TCAATGGCACGTCAGTACGTGCGAAGCATAGTAGGATCGATTCAGAGG 

GTTGCTCTAGCCATTGCTCCTCGTCCTGGCTCCAATATCAGTCCAATATCTGTTCCCACT 

TCCCCTGAAGCTCTCACTCTGGTCCGTTGGATCTCCCGGAGTTACAGCCTTCACACTGGT 

GW^GATCTCTTTGGATCTGATTCTCAAACCAGTGGT 

AATCACTCTGATGCAATCTTGTGCTGCTCCC^ 

TTCGCAAACCAAACCGGTTTAGACATGCTGGAAACGACTCTTGTAGCCCTTCAAG 

ATGCTAGACAAGACCCTTGACGAACCTGGTCGTAAAGOTCTTTGCTCTGAGTTCCCCAAG 

ATCATGCAACAGGGCTATGCTCATCTGCCGGCAGGAGTATGTGCGTCAAGCATGGGAAGG 

ATGGTATCTTACGAGCAGGCAACGGTGTGGAAAGTTCTTGAAGACGATGAATCAAACCAC 

TGCTTAGCTTTCATGTTCGTGAATTGGTCGTTCGTTTGA 

>G391 Amino Acid Sequence (domain in AA coordinates: 25-85) 
MMMVHSMSRDMMNRESPDKGLDSGKYW 
ILSNIEPKQIKVWFQNRRCREKQRKEAARLQTVN^ 
ENGHMKHQLHTASGTTTDNSCESVWSGQQHQQ^^ 

EFLSKATGTAVDWVQMIGMKPGPDSIGIVAISRNCSGIAARACGLVSLEPMKVAEILKDR

PSWLRDCRSVDTLSVIPAGNGGTIELIYTQMYAPTTLAAARDFWTLRYSTCLEDGSYVVC

ERSLTSATGGPTGPPSSNFVRAEMKPSGFLIRPCDGGGSILHIVDHVDLDAWSVPEVMRP

LYESSKILAQKMTVAALRHVRQIAQETSGEVQYGGGRQPAVLRTFSQRLCRGFNDAVNGF

VDDGWSPMGSDGAEDVTVMINLSPGKFGGSQYGNSFLPSFGSGVLCAKASMLLQNVPPAV

LVRFLREHRSEWADYGVDAYAAASLRASPFAVPCARAGGFPSNQVILPLAQTVEHEESLE

VVRLEGHAYSPEDMGLARDMYLLQLCSGVDENVVGGCAQLVFAPIDESFADDAPLLPSGF

RIIPLEQKSTPNGASANRTLDLASALEGSTRQAGEADPNGCNFRSVLTIAFQFTFDNHSR

DSVASMARQYVRSIVGSIQRVALAIAPRPGSNISPISVPTSPEALTLVRWISRSYSLHTG

ADLFGSDSQTSGDTLLHQLWNHSDAILCCSLKTNASPVFTFANQTGLDMLETTLVALQDI

MLDKTLDEPGRKALCSEFPKIMQQGYAHLPAGVCASSMGRMVSYEQATVWKVLEDDESNH

CLAFMFVNWSFV* 

>G438 (188..2716)

CGGGGTACCCAAGCCACGACCGTAGAATCTTCT 

TTTCTCTTACGATACGACGGACTTTCCGAAGAAATTAATTTAAAGAGAAAAGAAGAAGAA 
GCCAAAGAAGAAGAAGAAGCTAGAAGAAACAGTAAAGTTTGAGACTTTTTTTGAGGGTCG 
AGCTAAAATGGAGATGGCGGTGGCTAACCACCGTGAGAGAAGCAGTGACAGTATGAATAG 
ACATTTAGATAGTAGCGGTAAGTACGTTAGGTACACAGCTGAGCAAGTCGAGGCTCTTGA 
GCGTGTCTACGCTGAGTGTCCTAAGCCTAGCTCTCTCCGTCGACAACAATTGATCCGTGA 
ATGTTCC^TTTTGGCCAATATTGAGCCTAAGCAGATCAAAGTCTGGTTTCAGAACCGCAG 
GTGTCGAGATAAGCAGAGGAAAGAGGCGTCGAGGCTCCAGAGCGTAAACCGGAAGCTCTC 
TGCGATGAATAAACTGTTGATGGAGGAGAATGATAGGTTGCAGAAGCAGGTTTCTCAGCT 
TGTCTGCGAAAATGGATATATGAAACAGCAGCTAACTACTGTTGTTAACGATCCAAGCTG 
TGAATCTGTGGTCACAACTCCTCAGCATTCGCTTAGAGATGCGAATAGTCCTGCTGGATT 
GCTCTCAATCGCAGAGGAGACTTTGGCAGAGTTCCTATCCAAGGCTACAGGAACTGCTGT 
TGATTGGGTTCAGATGCCTGGGATGAAGCCTGGTCCGGATTCGGTTGGCATCTTTGCCAT 
TTCGCAAAGATGCAATGGAGTGGCAGCTCGAGCCTGTGGTCTTGTTAGCTTAGAACCTAT 
GAAGATTGCAGAGATCCTCAAAGATCGGCCATCTTGGTTCCGTGACTGTAGGAGCCTTGA 
AGTTTTCACTATGTTCCCGGCTGGTAATGGTGGCACAATCGAGCTTGTTTATATGCAGAC 
GTATGCACCAACGACTCTGGCTCCTGCCCGCGATTTCTGGACCCTGAGATACACAACGAG 
CCTCGACAATGGGAGTTTTGTGGTTTGTGAGAGGTCGCTATCTGGCTCTGGAGCTGGGCC 
TAATGCTGCTTCAGCTTCTCAGTTTGTGAGAGCAGAAATGCTTTCTAGTGGGTATTTAAT 
AAGGCCTTGTGATGGTGGTGGTTCTATTATTCACATTGTCGATCACCTTAATCTTGAGGC 
TTGGAGTGTTCCGGATGTGCTTCGACCCCTTTATGAGTCATCCAAAGTCGTTGCACAAAA
AATGACCATTTCCGCGTTGCGGTATATCAGGCAATTAGCCCAAGAGTCTAATGGTGAAGT

AGTGTATGGATTAGGAAGGCAGCCTGCTGTTCTTAGAACCTTTAGCCAAAGATTAAGC^ 

GGGCTTCAATGATGCGGTTAATGGGTTTGGTGACGACGGGTGGTCTACGATGCATTGTGA 

TGGAGCGGAAGATATTATCGTTGCTATTAACTCTACAAAGCATTTGAATAATATTTCTAA 

TTCTCTTTCGTTCCTTGGAGGCGTGCTCTGTGCCAAGGCTTCAATGCTTCTCCAAAATGT 

TCCTCCTGCGGTTTTGATCCGGTTCCTTAGAGAGCATCGATCTGAGTGGGCTGATTTCAA 

TGTTGATGCT^TATTCCGCTGCTACACTTAAAGCTGGTAGCTTTGCTTATCCGGGAATGAG 

ACCAACAAGATTCACTGGGAGTCAGATCATAATGCCACTAGGACATACAATTGAACACGA 

AGAAATGCTAGAAGTTGTTAGACTGGAAGGTCATTCTCTTGCTCAAGAAGATGCATTTAT 

GTCAC^GGATGTCCATCTCCTTCAGATTTGTACCGGGATTGACGAGAATGCCGTTGGAGC 

TTGTTCTGAACTGATATTTGCTCCGATTAATGAGATGTTCCCGGATGATGCTCCACTTGT 

TCCCTCTGGATTCCGAGTCATACCCGTTGATGCTAAAACGGGAGATGTACAAGATCTGTT 

AACCGCTAATCACCGTACACTAGACTTAACTTCTAGCCTTGAAGTCGGTCCATCACCTGA 

GAATGCTTCTGGAAACTCTTTTTCTAGCTCAAGCTCGAGATGTATTCTCACTATCGCGTT 

TCAATTCCCTTTTGAAAACAACTTGCAAGAAAATGTTGCTGGTATGGCTTGTCAGTATGT 

GAGGAGCGTGATCTCATCAGTTCAACGTGTTGCAATGGCGATCTCACCGTCTGGGATAAG 

CCCGAGTCTGGGCTCCAAATTGTCCCCAGGATCTCCTGAAGCTGTTACTCTTGCTCAGTG 

GATCTCTCAAAGTTACAGTCATCACTTAGGCTCGGAGTTGCTGACGATTGATTCACTTGG 

AAGCGACGACTCGGTACTAAAACTTCTATGGGATCACCAAGATGCCATCCTGTGTTGCTC 

ATTAAAGCCACAGCCAGTGTTCATGTTTGCGAACCAAGCTGGTCTAGACATGCTAGAGAC 

AACACTTGTAGCCTTACAAGATATAACACTCGAAAAGATATTCGATGAATCGGGTCGTAA 

GGCTATCTGTTCGGACTTCGCCAAGCTAATGCAACAGGGATTTGCTTGCTTGCCTTCAGG 

AATCTGTGTGTCAACGATGGGAAGACATGTGAGTTATGAACAAGCTGTTGCTTGGAAAGT 

GTTTGCTGCATCTGAAGAAAACAAGAACAATCTGCATTGTCTTGCCTTCTCCTTTGTAAA 

CTGGTCTTTTGTGTGATTCGATTGACAGAAAAAGACTAATTTAAATTTACGTTAGAGAAC 

TCAAATTTTTGGTTGTTGTTTAGGTGTCTCTGTTTTGTTTTTTAAAATTATTTTGATCAA 

A 

>G438 Amino Acid Sequence (domain in AA coordinates: 22-85) 

MEMAVANHRERSSDSMNRHLDSSGKYVRYTAEQVEALERVYAECPKPSSLRRQQLIRECS 

ILANIEPKQIKVWFQNRRCRDKQRKEASRLQSVNRKLSAMNKLLMEENDRLQKQVSQLVC

ENGYMKQQLTTVVNDPSCESVVTTPQHSLRDANSPAGLLSIAEETLAEFLSKATGTAVDW
VQMPGMKPGPDSVGIFAISQRCNGVAARACGLVSLEPMKIAEILKDRPSWFRDCRSLEVF
TMFPAGNGGTIELVYMQTYAPTTLAPARDFWTLRYTTSLDNGSFVVCERSLSGSGAGPNA
ASASQFVRAEMLSSGYLIRPCDGGGSIIHIVDHLNLEAWSVPDVLRPLYESSKVVAQKMT
ISALRYIRQLAQESNGEVVYGLGRQPAVLRTFSQRLSRGFNDAVNGFGDDGWSTMHCDGA
EDIIVAINSTKHLNNISNSLSFLGGVLCAKASMLLQNVPPAVLIRFLREHRSEWADFNVD

AYSAATLKAGSFAYPGMRPTRFTGSQIIMPLGHTIEHEEMLEVVRLEGHSLAQEDAFMSR

DVHLLQICTGIDENAVGACSELIFAPINEMFPDDAPLVPSGFRVIPVDAKTGDVQDLLTA

NHRTLDLTSSLEVGPSPENASGNSFSSSSSRCILTIAFQFPFENNLQENVAGMACQYVRS

VISSVQRVAMAISPSGISPSLGSKLSPGSPEAVTLAQWISQSYSHHLGSELLTIDSLGSD

DSVLKLLWDHQDAILCCSLKPQPVFMFANQAGLDMLETTLVALQDITLEKIFDESGRKAI

CSDFAKLMQQGFACLPSGICTSTMGRHVSYEQAVAWKVFAASEENKNNLHCLAFSFVNWS

FV* 

>G47 (38..472)

CTTCTTCTTCACATCGATCATCATACAACAACAAAAAATGGATTACAGAGAATCCACCGG 
TGAAAGTCAGTCAAAGTACAAAGGAATCCGTCGTCGGAAATGGGGCAAATGGGTATCAGA 
GATTAGAGTTCCGGGAACTCGTGACCGTCTCTGGTTAGGTTCATTCTCAACAGCAGAAGG 
TGCCGCCGTAGCAC^CGACGTTGCTTTCTTCTGTTTACACCAACCTGATTCTTTAGAATC 
TCTCAATTTCCCTCATTTGCTTAATCCTTCACrCGTTTCCAGAACTTCTCCGAGATCTAT 
CCAGCAAGCTGCTTCTAACGCCGGCATGGCCATTGACGCCGGAATCGTCCACAGTACCAG 
CGTGAACTCTGGATGCGGAGATACGACGACGTATTACGAGAATGGAGCTGATCAAGTGGA 
GCCGTTGAATATTTCAGTGTATGATTATCTGGGCGGCCACGATCACGTTTGATTTATCTC 
GACGGTCATGATCACGTTTGATCTTCTTTTGAGTAAGATTTTGTACCATAATCAAAACAG 
GTGTGGTGCTAAAATCTTACTCAAAACAAGATTAGGTACCACAGAGAAACAATCAAATGG 
TTGTGAATATAC^TTATAAGGTTTTGATTAATGTTTGTTTCACTGATTTAGTGAAGTTTG 
GTCCATTGTATACAAATCTATTCAAGAAACCTAGCGCGAGATCATGTTTCGTGATTGAAG 
ATTGAGATTTTTAAGTATTCGTAATATTTTTGTAAAATACAAATAAAAAAAAAAAAAAAA
AAAAA

>G47 Amino Acid Sequence (domain in AA coordinates: 11-80) 
MDYRESTGESQSKYKGIRRRKWGKWVSEIRVPGTRDRLWLGSFSTAEGAAVAHDVAFFCL
HQPDSLESLNFPHLLNPSLVSRTSPRSIQQAASNAGMAIDAGIVHSTSVNSGCGDTTTYY
ENGADQVEPLNISVYDYLGGHDHV*
>G559 (89..1285)

aaagttgctagctttaatttgccaacttactattcttatgtgtaataatcgtttgcaggg 
tcgttgatttggtgataagtcagtagaaATGgataaggagaaatctccagcacctccttg 
tggaggtcttcctcctccatctccatcaggtcgatgctctgcattctcagaagctggtcc 
cattggtcatggttcagatgctaatcgaatgagtcatgatattagccgtatgcttgataa 
cccacctaagaagattggacatcggcgagctcattctgaaatacttactctccctgatga 
tttgagctttgatagtgatcttggtgtggttggtaatgctgctgatggagcttctttctc 
tgatgagactgaagaagatttgctctctatgtatcttgatatggataagtttaattcttc 
tgctacatcttctgcccaagttggtgagccatcaggaactgcttggaaaaatgagacaat 
gatgcagacaggcacaggctcaacttccaatcctcagaatacggttaatagtcttggcga 
aaggccaagaatcaggcatcaacatagccaatctatggatggttcaatgaatatcaatga 
gatgcttatgtcgggaaatgaagatgattctgctattgatgctaagaagtctatgtctgc 
tactaaacttgctgagcttgctctcattgatcctaaacgtgctaagaggatatgggcaaa 
caggcagtccgcagcacgatcaaaagaaaggaagacgagatacatatttgagcttgagag 
aaaagtacagactttgcaaacagaggctacaactctctcagcccagttgaccctcttaca 
gagagacacaaatggcttgactgttgaaaacaatgagctgaagctgcggttacaaacaat 
ggagcagcaggttcacttgcaggatgaactaaacgaagcactaaaggaggaaatccagca 
tctgaaggtgttgactggccaagttgctccatcagcgttgaactatgggtcgtttggatc 
aaaccagcagcaattctattccaacaatcagtcaatgcaaacaatcttagctgcaaaaca 
gttccagcaacttcagattcattcacagaagcagcaacaacaacaacaacaacaacaaca 
gcaacaccaacagcagcagcagcaacagcaacagtatcagtttcaacagcaacagatgca 
acagcttatgcagcagcggcttcaacagcaagaacaacaaaatggagtaagactcaagcc 
ttcacaagcccagaaagagaacTGAggaatatgaatatgtcccacgtaagtgagaggttc 
tccttctgaacaattcctttctcattcataaattgttgttcatccatcacttgcagtctc 
ttggattttagggttttagctaacaca 

>G559 Amino Acid Sequence (domain in AA coordinates: 203-264) 

MDKEKSPAPPCGGLPPPSPSGRCSAFSEAGPIGHGSDANRMSHDISRMLDNPPKKIGHRR 

AHSEILTLPDDLSFDSDLGVVGNAADGASFSDETEEDLLSMYLDMDKFNSSATSSAQVGE

PSGTAWKNETMMQTGTGSTSNPQNTVNSLGERPRIRHQHSQSMDGSMNINEMLMSGNEDD

SAIDAKKSMSATKLAELALIDPKRAKRIWANRQSAARSKERKTRYIFELERKVQTLQTEA

TTLSAQLTLLQRDTNGLTVENNELKLRLQTMEQQVHLQDELNEALKEEIQHLKVLTGQVA

PSALNYGSFGSNQQQFYSNNQSMQTILAAKQFQQLQIHSQKQQQQQQQQQQQHQQQQQQQ 

QQYQFQQQQMQQLMQQRLQQQEQQNGVRLKPSQAQKEN* 

>G568 (141..995)

GACCGGCTAAAGTCAAGAACCTCTCTCTGAGCTCTCACCACTTTCTCTCTCTACTCCCTC 
TCTGCGTGTAGGATACTACTAGACAATTGACAACCAAAGACTAAAGCTGTCTTGTTGGTT 
CACTTCTGTTCTCTTTTCCAATGTTGTCATCAGCTAAGCATCAGAGAAACCATAGA^ 
CTGCTACAAACAAGAACCAGACTCTCACCAAAGTTTC 

CGTCTTCTTCTTCATCATCATCAACCTCATCATCATCTCCTTTACCTTCTCAAGACTCTC 

AAGCCCAGAAGAGATCTCTTGTCACCATGGAAGAAGTTTGGAATGACATCAACCTTGCTT 

CCATCCACCACCTAAACCGACACAGCCCTCATCCACAACACAACCACGAGCCAAGGTTCA 

GGGGCCAAAACCACC^C^CCAAAACCCTAACTCAATCTTCCAAGATTTTCTCA^ 

CTTTGAACCAGGAACCAGCACCCACAAGCCAGACCACGGGTTCTGCGCCTAATGGCGATT 

CGACCACGGTCACTGTTCTTTACAGCTCTCCTTTTCCACCTCCTGCAACTGTTCTGAGCT 

TGAATTCCGGCGCTGGCTTCGAGTTTCTCGATAACCAAGATCCTCTTGTTACCTCAAACT 

CTAATCTTCATACCCACCATCACCTCTCAAACGCTCATGCCTTCAACACCTCTTTCGAGG 

CTCTGGTTCCATCCAGTTCTTTTGGTAAGAAAAGAGGCCAAGATTCCAATGAAGGTTCAG 

GGAATAGAAGACATAAGCGTATGATCAAGAACAGAGAATCTGCAGCTCGTTCCCGCGCTA 

GGAAACAGGCTTATACAAACGAGTTAGAACTTGAAGTTGCTCACTTGCAGGCAGAAAATG 

C^AGACTCAAGAGACAACAAGATCAAAAAATGGCTGCAGCAATTCAGCAACCCAAAAAGA 

ACACACTT(^CGGTCTTCCACAGCTCCATTTTGAGAAATCTA<^VAGTCCTTGTTTCTCT 

TTTGGGGATTGAGATTGTCTCATGAAGAAGTGAAAAAATGGCAAAAGTTTGTACCCTTTT
TTATTAGCTATAAGTATAACTAAGCCTAAAATTGTAGAACTAAGATATTGTAGGGGAAAA
AAGAAGATGTAAAACAAAAGACCCGGAAAGAGAAAAGGATCTTTCAATTTCCTAAGGCAC 
AGGAACACCTGTCCTGGGTCCTCTCTTAATGTTCTO 
TTCACTTCTGTACTAACTTATACTTGTATTCTTG 

>G568 Amino Acid Sequence (domain in AA coordinates: 215-265) 

MLSSAKHQRNHMiSATNKNQTLTKVSSISSSSPSSSSSSSSTSSSSPLPSQDSQAQKRSL 

VTMEEVWNDINLASIHHLNRHSPHPQHNHEPRFRGQNHHNQNPNSIFQDFLKGSLNQEPA

PTSQTTGSAPNGDSTTVTVLYSSPFPPPATVLSLNSGAGFEFLDNQDPLVTSNSNLHTHH

HLSNAHAFNTSFEALVPSSSFGKKRGQDSNEGSGNRRHKRMIKNRESAARSRARKQAYTN

ELELEVAHLQAENARLKRQQDQKMAAAIQQPKKNTLQRSSTAPF* 

>G580 (43..747)

CCAAAAAACAAAGCATTCTATGCTATTCTGTTCTGT^ 

C^TAATAAGATCAACAACCATAGTGCCTTTTCAATTTCCTCTTCATCATCATCATTATCA 

ACATCATCCTCCCTAGGCCATAACAAATCTCAAGTCACCATGGAAGAAGTATGGAAAGAA 

ATCAACCTTGGTTCACTTCACTACCATCGGCAACTAAACATTGGTCATGAACCAATGTTA 

AAGAACCAAAACCCTAATAACTCCATCTTTCAAGATTTCCTCAACATGCCT 

CCACmCCACC^CCACC^CCACCTTCCTCTTCCACC^TTGTCACTGCTCTCTATGGCTCT 

CTGCCTCTTCCGCCTCCTGCCACTGTCCTCAGCTTAAACTCCGGTGTTGGATTCGAGTTT 

CTTGATACCACAGAAAATCTTCTTGCTTCTAACCCTCGCTCCTTTGAGGAATCTGCAAAG 

TTTGGTTGTCTTGGTAAGAAAAGAGGCCAAGATTCTGATGATACTAGAGGAGACAGAAGG 

TATAAGCGTATGATCAAGAACAGAGAATCTGCTGCTCGTTCAAGGGCTAGGAAGCAGGCA 

TATACAAACGAACTTGAGCTTGAAATTGCTCACTTGCAGACAGAGAATGCAAGACTCAAG 

ATACAACAAGAGCAGCTGAAAATAGCCGAAGCAACTCAAAACCAAGTAAAGAAAACACTA 

CAACGGTCTTCCACAGCTCCATTTTGAGAAAAATCTACTATTTCTTTTTGGGGGAGTTTC 

AAGTGTTTCTTATGAAGATGAGAAAAACAGAAAAAGTTTGTACATTTTAGCTAAGTTAAA 

TTTGTGGTGGTAAGTAATGTAAAAGAAAAGTGTGTGTAGAAGAAAAGTGTCTAGAAAAAG 

AAAGCAACTAACTTTCTTCTTCTTCTCTGGTTTCCTATCAACTCTTTTGACTTTTGTACT 

TTTTTTCTTCTCTACTTAACCTCTATTATTGTAATGCC^GTC^U^GTCCTTATCTAGCTA 

GTACATGAGTTTCTGTTTTCACTGGTTAAGCCAT 

>G580 Amino Acid Sequence (domain in AA coordinates: 162-218)
MLSSAIO^IimHSAFSISSSSSSLSTSSSLGHNK^ 

GHEPMLKNQNPNNSIFQDFLNMPLNQPPPPPPPPSSSTIVTALYGSLPLPPPATVLSLNS 
GVGFEFLDTTENLLASNPRSFEESAKFGCLGKKRGQDSDDTRGDRRYKRMIKNRESAARS
RARKQAYTNELELEIAHLQTENARLKIQQEQLKIAEATQNQVKKTLQRSSTAPF*
>G615 (197..1252)

TTTTTTCTTTTCTTTCTTTTTTTGCTGGTGTGAGAAATTGTACGCTTACTATCTCTCTCT 
CTCTCTGCCAGATTCTCTCTTTTTGATGATGTGAAAGTTGTGCTTTTGTTTCTTAAGAAA 
T^AGGCATATTTTTAATACTTGATTCTTGGTTCTTGATTCTTGATTCTTGGTTTTTTTTAG 
CTTCTTAAGTTCGGTGATGTCGTCTTCCACCAATGACTACAACGATGGTAATAACAATGG 
AGTGTACCCTCTCTCTCTTTACCTTTCTTCACTCTCTGGCCATGAAGACATCATTCATAA 
TCCCTACAACCATCAGTTAAAAGCATCTCCGGGCCATATGGTATCAGCAGTTCCTGAATC 
TCTGATCGATTACATGGCGTTTAAGTCAAATAATGTTGTGAATCAACAAGGCTTTGAGTT 
TCCTGAGGTGTCAAAGGAAATCAAGAAGGTGGTGAAGAAGGACCGACATAGCAAGATTCA 
AACGGCACAAGGGATTAGAGACAGGAGGGTTAGGCTTTTTATTGGGATTGCTCGCCAATT 
CTTTGATCTTCAGGATATGTTGGGGTTTGATAAAGCTAGTAAAACGTTAGACTGGCTGCT 
CAAGAAGTCAAGAAAAGCCATCAAAGAGGTCGTACAAGCAAAAAACCTCAACAATGATGA 
TGAAGATTTTGGAAACATTGGAGGCGATGTAGAACAAGAAGAGGAGAAGGAGGAGGATGA 
CAATGGCGATAAGAGCTTCGTGTATGGTTTGAGCCCCGGGTACGGTGAAGAAGAAGTGGT 
ATGTGAGGCCACGAAGGCAGGGATAAGAAAGAAGAAGAGTGAGTTGAGAAACATCTCATC 
AAAGGGGCTAGGAGCGAAAGCTAGAGGAAAAGCAAAGGAGCGAACAAAAGAGATGATGGC 
CTATGATAATCCAGAGACTGCCTCTGATATTACACAATCTGAAATCATGGACCCATTCAA 
GAGGTCTATAGTCTTCAATGAAGGAGAAGATATGACACACCTTTTCTACAAGGAACCAAT 
CGAGGAGTTTGATAATCAAGAATCTATCTTAACCAATATGACTCTACCAACGAAGATGGG 
TCAAAGTTACAATCAAAATAATGGGATACTTATGTTGGTAGATCAGAGTTCTAGCAGCAA 
CTATAATACATTTCTGCCTCAAAATTTGGATTATAGTTATGATCAAAACCCTTTTCATGA 
CCAAACCTTATATGTAGTCACCGACAAAAATTTCCCCAAAGGTTTCCTATAAATCTCGAC 
AGTTTTGAAGGACTATGCATGATCAAGTTTAAACATGTAAGCCAATATAGTCCCTTATTC
CTCTGAATGTATACAAAATCTATAGTTATGTATATCTGTTCCTTTTTAACGTATCTTTAT

TGATCTTCTGTGCCTTGATCAAAATTGTCAT^ 

CTACAACTTTTAAGTGGTATTATTGTAACC^ 

GAACATGTTTATATAAAAA 

>G615 Amino Acid Sequence (domain in AA coordinates: 88-147)
MSSSTNDYNDGNNNGVYPLSLYLSSLSGHQDIIHNPYNHQLKASPGHMVSAVPESLIDYM
APKSNNVVNQQGFEFPEVSKEIKKVVKKDRHSKIQTAQGIRDRRVRLFIGIARQFFDLQD
MLGFDKASKTLDWLLKKSRKAIKEVVQAKNLNNDDEDFGNIGGDVEQEEEKEEDDNGDKS
FVYGLSPGYGEEEVVCEATKAGIRKKKSELRNISSKGLGAKARGKAKERTKEMMAYDNPE
TASDITQSEIMDPFKRSIVFNEGEDMTHLFYKEPIEEFDNQESILTNMTLPTKMGQSYNQ
NNGILMLVDQSSSSNYNTFLPQNLDYSYDQNPFHDQTLYVVTDKNFPKGFL*
>G732 (73..588)

AAAAAAACCAAACATAAAACATAAAACTCTGTCCTTTTTTTGT^ 

TGTTAAAAATCAATGGCGTCATCTAGCAGCACATACCGGAGCTCAAGCTCTTCCGACGGT 
GGTAATAATAACCCGTCGGACTCCGTCGTCACCGTCGACGAACGAAAACGTAAAAGAATG 
TTATCGAACAGAGAATCTGCACGTAGGTCAAGGATGCGTAAACAGAAACACGTTGATGAT 
CTAACGGCTCAGATC^TCAGCTATCAAACGAC^CCGTCAGATCTTGAACAGCCTCACC 
GTAAGATCTCAGCTTTACATGAAGATCCAAGCCGAGAACT^ 

GAGGAGCTTAGCACCAGACTCCAATCTCTCAACGAGATCGTTGATCTTGTTCAATCCAAC 

GGTGCAGGATTTGGTGTTGAC(^GATCGACGGCTGTGGTTTTGATGATCGTACGGTTGGG 

ATCGACXMATATTACGATGATATG^TATGATGAGTAATGTTAATCATTGGGGTGGTTCG 

GTTTACACTAACCAACCCATTATGGCTAATGATATCAATATGTATTGATTAATAAAATTA 

ATTAAAATAATTAGATGCCCCTTTTTTGTCTTTOTATTTTAAAATTTAGCCCATTTTGGT 

GTTTTTGGGTTGGTGTGATGATGTAATTATAGTACATGCATCTTTGATTGGTTGGAAGGA 

TAZ^TATAAACTTTATATATATATTGGGGCATATATATATGAGTTGTACTTTGCATGTAT 

TGGTGTGTGTTTTGTTATAATTATATGATTATATATGTTTATGTTAAAAAAAAA 

>G732 Amino Acid Sequence (domain in AA coordinates: 31-91) 

MASSSSTYRSSSSSDGGNNNPSDSVVTVDERKRKRMLSNRESARRSRMRKQKHVDDLTAQ

INQLSITONRQILNSLTVTSQLYMKIQAENSVLTAQI^ELSTRLQSI^ 

GVDQIDGCGFDDRWGIDGYYDDMNI^ 

>G988 (1..1338)

ATGCTTACTTCCTTCAAATCCTCTAGCTCCTCCTCCGAAGATGCCACCGCTACCACCACC 
GAGAATCCTCCTCCTTTGTGCATCGCCTCCTCCTCGGCCGCAACCTCCGCCTCACATCAC 
CTCCGTCGTCTTCTTTTC^CCGCTGCGAATTTCGTCTCCCAGTCAAACTTCACCGCCGCT 
CAAAACTTACTCTCAATCCTCTCCCTTAACTCTTCTCCTCACGGCGACTCCACCGAGCGA 
CTTGTACACCTCTTCACTAAAGCCTTGTCCGTACGAATCAACCGTCAGCAACAAGATCAG 
ACGGCTGAAACGGTTGCCACGTGGACGACGAACGAAATGACGATGAGTAACTCCACGGTG 
TTCACGAGCAGTGTATGCAAAGAACAGTTCTTGTTTCGAACCAAGAACAACAATTCTGAC 
TTCGAGTCTTGTTACTATCTTTGGCTAAACCAACTAACGCCGTTTATTCGGTTCGGTCAT 
TTAACGGCGAACCAAGCTATCCTCGACGCGACGGAGACAAACGATAACGGAGCTCTACAT 
ATACTTGATTTAGATATATCACAAGGACTTCAATGGCCTCCATTGATGCAAGCCCTAGCA 
GAGAGGTCATCAAACCCTAGCAGTCCACCTCCATCTCTCCGCATAACCGGATGCGGTCGA 
GATGTAACCGGATTAAACCGAACTGGAGACCGGTTAACCCGGTTCGCTGACTCTTTAGGT 
CTCCAATTCCAGTTTCACACGCTAGTGATCGTAGAAGAAGATCTCGCCGGACTTTTGCTA 
CAGATCCGATTGTTAGCTCTCTCAGCCGTACAAGGAGAGACCATTGCCGTCAATTGTGTT 
CACTTCCTCCACAAAATATTTAACGACGATGGAGATATGATCGGTCACTTCTTGTCAGCG 
ATCAAGAGCTTAAACTCTAGAATCGTTACAATGGCAGAGAGAGAAGCTAATCATGGAGAT 
CACTCGTTCTTGAAXAGATTCTCTGAGGCAGTGGATCATTACATGGCGATCTTTGATTCG 
TTGGAAGCGACGTTGCCGCCAAATAGCCGAGAGAGACTAACCCTAGAGCAACGGTGGTTC 
GGTAAGGAGATTTTGGATGTTGTGGCGGCGGAAGAGACGGAGAGAAAGCAAAGACATCGG 
AGGTTTGAGATTTGGGAAGAGATGATGAAGAGGTTTGGTTTCGTTAACGTTCCTATTGGA 
AGCTTTGCTTTGTCTCAAGCTAAGCTTCTTCTTAGACTTCATTATCCTTCAGAAGGTTAT 
AATCTTCAGTTCCTTAACAATTCTTTGTTTCTTGGCTGGCAAAATCGTCCCCTCTTCTCC 
GTTTCGTCGTGGAAATGA 

>G988 Amino Acid Sequence (domain in AA coordinates: 178-195) 
MLTSFKSSSSSSEDATATTTENPPPLCIASSSAATSASHHLRRLLFTAANFVSQSNFTAA 
QNLLSILSLNSSPHGDSTERLVHLFTKALSVRINRQQQDQTAETVATWTTNEMTMSNSTV
FTSSVCKEQFLFRTKNNNSDFESCYYLWLNQLTPFIRFGHLTANQAILDATETNDNGALH

ILDLDISQGLQWPPLMQALAERSSNPSSPPPSLRITGCGRDVTGLNRTGDRLTRFADSLG

LQFQFHTLVIVEEDLAGLLLQIRLLALSAVQGETIAVNCVHFLHKIFNDDGDMIGHFLSA

IKSLNSRIVTMAEREANHGDHSFLNRFSEAVDHYMAIFDSLEATLPPNSRERLTLEQRWF

GKEILDVVAAEETERKQRHRRFEIWEEMMKKFGFVNVPIGSFALSQAKLLLRLHYPSEGY

NLQFLNNSLFLGWQNRPLFSVSSWK* 

>G1519 (1..1146) 

ATGAGGCTTAATGGGGATTCGGGTCCGGGTCAGGATGAACCCGGTTCGAGCGGGTTTCAC 
GGCGGAATCAGACGATTCCCGTTAGCAGCTCAGCCGGAGATTATGAGAGCTGCTGAGAAA 
GACGATCAATACGCTTCTTTCATCCACGAAGCTTGCCGCGATGCCTTCCGACACCTTTTC 
GGTACAAGAATCGCTCTTGCTTACCAGAAGGAGATGAAGCTACTTGGACAGATGCTTTAC 
TATGTTCTTACGACAGGTTCAGGGCAACAAACTTTAGGAGAGGAATATTGTGACATTATA 
CAGGTTGC^GGGCCTTATGGACTCTCTCCTA(^CCAGCTAGACGTGCTTTGTT<^TATTG 
TACCAGACCGCAGTTCCATATATCGCAGAGAGAATTAGCACTCGAGCTGCTACGCAAGCA 
GTCACCTTTGATGAGTCTGATGAGTTTTTTGGTC 

ATAGATCTTCCATCTTCATCTCAAGTTGAAACTTCAACTTCTGTAGTATCTAGGTTAAAC 
GATAGACTTATGAGATCGTGGCACCGAGCTATTCAGCGATGGCCTGTGGTTCTTCCTGTT 
GCCCGCGAAGTCITACAACTGGTTTTGCGTGCCAATCTGAT^ 

TTTTATTATCATATATCGAAACGTGCATCCGGGGTTCGTTATGTTTTCATAGGAAAGCAA 
CTGAATCAGAGACCTAGATACCAAATTCTTGGGGTTTTCCTTCTAATCCAATTGTGCATC 
CTTGCTGCTGAGGGCTTGCGTCGGAGTAATTTGTCATCTATC^CTAGCTCCATTCAGCAG 
GCTTCTATAGGATCTTATCAAACTTCAGGAGGGAGAGGTTTACCTGTTTTAAATGAAGAG 
GGGAATTTGATAACTTCGGAAGCTGAAAAGGGAAACTGGTCTACCTCCGATTCAACTTCA 
ACGGAGGCAGTAGGGAAATGCACTCTCTGCTTAAGCACCCGTCAGCACCCAACGGCCACT 
CCTTGTGGTCATGTGTTTTGTTGGAGCTGCATTATGGAATGGTGC7UVCGAGAAGCAAGAA 
TGCCCTCTTTGTCGAACGCCCAATACCCATTCAAGTTTGGTTTGTTTGTATCATTCTGAT 
TTTTAG 

>G1519 Amino Acid Sequence (domain in AA coordinates: 327-364) 

MRLNGDSGPGQDEPGSSGFHGGIRRFPLAAQPEIMRAAEKDDQYASFIHEACRDAFRHLF 

GTRIALAYQKEMKLLGQMLYYVLTTGSGQQTLGEEYCDIIQVAGPYGLSPTPARRALFIL

YQTAVPYIAERISTRAATQAVTFDESDEFFGDSHIHSPRMIDLPSSSQVETSTSVVSRLN

DRLMRSWHRAIQRWPVVLPVAREVLQLVLRANLMLFYFEGFYYHISKRASGVRYVFIGKQ

LNQRPRYQILGVFLLIQLCILAAEGLRRSNLSSITSSIQQASIGSYQTSGGRGLPVLNEE 

GNLITSEAEKGNWSTSDSTSTEAVGKCTLCLSTRQHPTATPCGHVFCWSCIMEWCNEKQE 

CPLCRTPNTHSSLVCLYHSDF* 

>G374 (1..1359) 

ATGGACAACAAAAATGATCAGGATATTGATGTTAGATCAGTGGTTGAAGCTGTTTCCGCC 
GATCTTTCCTTTGGTGCTCCCCTCTATGTGGTTGAGAGCATGTGCATGCGCTGCCAAGAA 
AATGGAAC7VACCAGATTTCTATTGACCTTAATTCCTCACTTCAGAAAGGTCTTAATATCT 
GCATTTGAATGTCCGCATTGCGGGGAAAGGAATAATGAAGTTCAGTTCGCAGGCGAGATT 
CAACCCCGTGGATGCTGTTACAATCTAGAGGTTCTAGCTGGTGATGTGAAGATATTTGAC 
CGGCAAGTTGTGAAATCTGAATCAGCCACTATTAAGATTCCTGAACTGGATTTTGAGATT 
CCACCAGAGGCCCAACGTGGAAGTTTGTCTACTGTGGAAGGGATATTAGCACGGGCTGCT 
GATGAACTGAGTGCCCTTCAAGAAGAACGCAAGAAAGTTGATCCTAAAACTGCTGAAGCA 
ATAGACCAATTCTTGTCCAAACTGAGAGCTTGTGCTAAAGCAGAGACATCCTTCACCTTC 
ATTTTGGATGATCCTGCTGGAAACAGTTTCATTGAGAACCCACATGCTCCATCACCAGAT 
CCCTCTCTAACCATCAAATTCTATGAGCGAACACCAGAGCAACAAGCAACACTTGGATAT 
GTTGCTAACCCATCTCAGGCTGGACAATCAGAAGGAAGCCTTGGCGCACCTGTGATGACT 
TTCCCTTCAACTTGCGGAGCATGTACGGAGCCGTGTGAGACACGGATGTTCAAAATAGAA 
ATCCCGTACTTTCAGGAAGTTATTGTCATGGCATCTACATGTGACAGTTGTGGCTATCGT 
AATTCTGAGTTGAAGCCTGGTGGTGCAATTCCTGAAAAGGGAAAGAAGATTACTCTCTCT 
GTGAGGAACATTACAGACCTTAGCCGAGATGTTATCAAGTCGGACACTGCAGGAGTGATA 
ATCCCAGAACTTGATCTGGAGCTAGCTGGTGGTACACTTGGTGGAATGGTAACAACAGTT 
GAAGGGTTGGTTACACAGATCAGAGAAAGCCTAGCGAGAGTTCACGGATTCACTTTTGGT 
GATAGTATGGAAGAGAGTAAGTTGAACAAATGGAGAGAATTTGGAGCCAGGCTCACTAAG 
CTCCTAAGCTTTGAACAGCCGTGGACATTGATTCTTGATGATGAATTAGCAAATTCCTTT 
ATTGCACCAGTAACAGATGATATCAAAGATGACCATCAGCTCACATTTGAAGAGTACGAG 






AGGTCATGGGATCAAAACGAGGAGTTGGGTCTCAACGACATAGATACTTCTTCAGCTGAT 
GCTGCTTATGAATCCACAGAGACGACTAAATTACCTTAA 

>G374 Amino Acid Sequence (domain in aa coordinates: 35-67, 245-277)
MDNKNDQDIDVRSVVEAVSADLSFGAPLYVVESMCMRCQENGTTRFLLTLIPHFRKVLIS

AFECPHCGERNNEVQFAGEIQPRGCCYNLEVLAGDVKIFDRQVVKSESATIKIPELDFEI

PPEAQRGSLSTVEGILARAADELSALQEERKKVDPKTAEAIDQFLSKLRACAKAETSFTF

ILDDPAGNSFIENPHAPSPDPSLTIKFYERTPEQQATLGYVANPSQAGQSEGSLGAPVMT 

FPSTCGACTEPCETRMFKIEIPYFQEVIVMASTCDSCGYRNSELKPGGAIPEKGKKITLS 

VRNITDLSRDVIKSDTAGVIIPELDLELAGGTLGGMVTTVEGLVTQIRESLARVHGFTFG 

DSMEESKLNKWREFGARLTKLLSFEQPWTLILDDELANSFIAPVTDDIKDDHQLTFEEYE

RSWDQNEELGLNDIDTSSADAAYESTETTKLP* 

>G877 (397..2460)

CAAAGATTAGACTAATCCGACTGTGTTTTTAATCAATCATCATTTTATTTAGGGGAGAGA 
AGTTGTAAAGTTTTGATTTTTTTTTCTGGGTTTTTTCTC 

AGAGAGGAAGAAGGAGAAGAAAAAAATATCTCTTTCTCTCCGGCTTTCAAGAAAATCT 
CTTTTTTCCTTCATCAGTGTTAAATTCGGATCCGGGTCGGGTGGGTTTTCGGTTTTTGGT 
GTTCGGATGAGAGCACAGTTGGATGTTAGCGACGGAACTGAGGATTTCAGTTTGCGG 
CGGCGGCTGTGACGGTGTTTGTGTGTCGTCTTC 

TTTGATCAGAGATTC^GCCAAATTCTTGGATACTAAATGGCTGGTTTTGATGAAAATGTT 

GCTGTGATGGGAGAATGGGTGCCTCGTAGTCCTAGTCCCGGGACACTTTTCTCCTCTGCT 

ATTGGAGAAGAGAAGAGCTCGAAACGTGTTCTTGAAAGAGAGTTATCTTTGAATCATGGT 

CAAGTTATTGGTTTAGAAGAAGACACTAGTAGTAATCATAACAAGGATTCTTCACAAAGC 

AATGTTTTTCGAGGTGGTCTCAGTGAAAGAATTGCTGCAAGAGCTGGATTTAATGCTCCA 

AGGTTGAACACTGAGAATATCCGCACCAACACCGACTTTTCCATTGACTCTAACCTTCGA 

TCTCCTTGCTTAACCUVTCTCTTCTCCTGGCCTTAGCCCTGCAACACTCTTGGAATCTCCT 

GTTTTCCTTTCTAACCCATTGGCTCAACCTTCTCCAACTACCGGGAAATTTCCATTTCTT 

CCTGGTGTTAATGGTAATGCATTGTCTTCTGAGAAAGCGAAAGACGAGTTCTTTGATGAT 

ATTGGAGCATCATTCAGCTTCCATCCTGTTTCAAGATCATCTTCCTCTTTCTTCCAAGGC 

ACAACAGAGATGATGTCAGTTGATTATGGTAACTACAACAATAGATCTTCTTCTCATC^ 

TCCGCAGAAGAAGTAAAACCTGGCTCTGAAAACATAGAAAGCTCCAATCTTTATGGGATT 

GAAACTGACAATCAAAACGGGCAGAACAAGACATCTGATGTCACTACAAACACCAGTCTT 

GAAACCGTGGATCATCAAGAGGAAGAAGAAGAGCAAAGACGCGGTGATTCGATGGCTGGT 

GGTGCGCCTGCAGAGGATGGATATAACTGGAGGAAATACGGACAAAAGTTGGTCAAAGGA 

AGTGAGTATCCGCGAAGCTATTACAAGTGC^CAAACCCGAATTGTCAGGTGAAGAAGAAA 

GTTGAGAGATCAAGGGAAGGTCACATCACAGAGATTATATACAAAGGAGCTC^ 

CTTAAACCTCCACCTAATCGCCGCTCAGGGATGCAAGTAGATGGAACTGAACAAGTTGAA 

CAACAACAACAACAGAGAGATTCTGCTGCAACGTGGGTTAGTTGTAATAACACTCAACAA 

CAAGGTGGAAGCAATGAGAACAATGTCGAAGAGGGATCTACGAGATTCGAGTATGGAAAC 

CAATCTGGATCAATTC^GCTCAAACCGGAGGTCAATACGAGTC^GGTGATCCTGTGGTT 

GTGGTTGATGCTTCTTCAACATTCTCTAATGATGAAGATGAAGATGATCGAGGGACACAT 

GGAAGTGTTTCTTTGGGTTACGATGGAGGAGGAGGAGGTGGGGGAGGAGAAGGAGATGAA 

TCAGAGTCGAAAAGAAGGAAACTAGAAGCTTTTGCAGCAGAGATGAGTGGATCAACAAGA 

GCCATACGTGAGCCAAGAGTTGTTGTGCAGACAACGAGTGATGTTGACATTCTTGATGAT 

GGTTATCGCTGGCGAAAATATGGTCAGAAAGTTGTCAAAGGC^^TCCAAATCCAAGGAGT 

TATTACAAATGCACAGCTCCAGGATGTACAGTGAGGAAACATGTTGAAAGAGCTTCTCAT 

GATCTCAAATCCGTTATAACAACTTACGAAGGCAAACATAACCATGACGTCCCCGCTGCA 

CGCAACAGCAGCCACGGAGGCGGTGGTGATAGTGGTAACGGTAACAGCGGCGGTTCAGCC 

GCAGTTTCTCACCA^ACCACAACGGTCATCACTCAGAGCCGCCACGTGGGAGATTCGAC 

AGACAAGTCACAACTAACAATCAGTCTCCTTTTAGCCGTCCCTTTAGCTTTCAGCCACAT 

TTGGGTCCTCCTTCTGGTTTCTCCTTCGGTTTAGGACAAACCGGTTTGGTTAATCTTTCA 

ATGCCTGGTTTAGCGTATGGTCAAGGGAAAATGCCGGGTTTGCCTCACCCGTATATGACA 

CAACCGGTTGGGATGAGTGAAGCAATGATGCAGAGAGGGATGGAACC?UVAGGTTGAACCG 

GTTTCAGATTCAGGACAATCGGTATATAACCAGATCATGAGTAGATTACCTCAGATTTGA 

AATTTACTCTTCTTCTTCTTCTTCTGCATTTGGTCACTCCTTATAATAACTTTTAATTTC 

TGCTTCTTCTTCTTCTTTCATTTATTGGTTTCAAACTTTGGGGAAGGTAAAGGCTGTTTT 

ATTGTTAAAAAAAAAAAAAAAAA 

>G877 Amino Acid Sequence (domain in AA coordinates: 272-328, 487-603)
MAGFDEKVAVMGEWVPRSPSPGTLFSSAIGEEKSSKRVLERELSLNHGQVIGLEEDTSSN

HNKDSSQSNVFRGGLSERIAARAGFNAPRLNTENIRTNTDFSIDSNLRSPCLTISSPGLS 

PATLLESPVFLSNPLAQPSPTTGKFPFLPGVNGNALSSEKAKDEFFDDIGASFSFHPVSR 

SSSSFFQGTTEMMSVDYGNYNNRSSSHQSAEEVKPGSENIESSNLYGIETDNQNGQNKTS

DVTTNTSLETVDHQEEEEEQRRGDSMAGGAPAEDGYNWRKYGQKLVKGSEYPRSYYKCTN

PNCQVKKKVERSREGHITEIIYKGAHNHLKPPPNRRSGMQVDGTEQVEQQQQQRDSAATW

VSCNNTQQQGGSNENNVEEGSTRFEYGNQSGSIQAQTGGQYESGDPVVVVDASSTFSNDE

DEDDRGTHGSVSLGYDGGGGGGGGEGDESESKRRKLEAFAAEMSGSTRAIREPRVVVQTT

SDVDILDDGYRWRKYGQKVVKGNPNPRSYYKCTAPGCTVRKHVERASHDLKSVITTYEGK

HNHDVPAARNSSHGGGGDSGNGNSGGSAAVSHHYHNGHHSEPPRGRFDRQVTTNNQSPFS

RPFSFQPHLGPPSGFSFGLGQTGLVNLSMPGLAYGQGKMPGLPHPYMTQPVGMSEAMMQR

GMEPKVEPVSDSGQSVYNQIMSRLPQI*
>G1000 (1..954) 

ATGGGAAGACCTCCTTGTTGTGACAAGTCCAATGTCAAGAAAGGTCTCTGGACCGAGGAA 
GAAGACGCTAAGATCCTTGCTTATGTTGCTATCCATGGTGTAGGAAACTGGAGCTTGATC 
CCCAAAAAAGCAGGTCTGAATCGATGTGGAAAGAGCTGTAGACTAAGATGGACTAATTAC 
TTAAGACCTGACCTTAAACATGACAGCTTCTCTACCCAAGAAGAAGAGCTTATCATTGAG 
TGTCATAGAGCCATTGGCAGCAGGTGGTCTTCCATTGCACGAAAGCTTCCAGGAAGAACG 
GATAATGATGTGAAGAATCACTGGAACACAAAGCTGAAGAAGAAGCTGATGAAAATGGGG 
ATAGACCCGGTGACTCATAAACCGGTTTCTCAACTCCrTGCAGAATTCAGAAACATTAGC 
GGCCATGGAAATGCATCCTTCAAAACAGAACCATCTAACAACTCTATACTCACACAATCC 
AACTCAGCTTGGGAAATGATGAGAAACAGAACAAGAAAC 

TCTCCAATGATGTTTACAAATTCCTCTGAGTACCAAACTACTCCATTTCATTTCTATAGC 
(^TCO&AATCATCTGCTCAATGGAAC^ 

AGTATCACTCAGCCAAACCAAGTACCTCAAACACCGGTTACTAACTTCTACTGGAGCGAT 

TTCCTTCTCTCGGACCCGGTTCCTCAAGTAGTGGGATCCTCAGCTACTAGCGACCTCACT 

TTTACGCAGAACGAACATCATTTC^^CATCGAAGCCGAATACATCTCTCAAAACATCGAT 

TCAAAGGCCTCGGGAAC^TGTCATTCCGCGAGTTCCTTCGTTGACGAAATACTAGATAAA 

GACCAAGAGATGTTGTCACAGTTTCCTCAACTCTTGAATGATTTCGATTATTAG 

>G1000 Amino Acid Sequence (domain in AA coordinates: 14-117) 

MGRPPCCDKSNVKKGLWTEEEDAKILAYVAIHGVGNWSLIPKKAGLNRCGKSCRLRWTNY

LRPDLKHDSFSTQEEELIIECHRAIGSRWSSIARKLPGRTDNDVKNHWNTKLKKKLMKMG

IDPVTHKPVSQLIJ^FRNISGHGNASFKTEPSNNSIL^^ 

SPMMFTNSSEYQTTPFHFYSHPNHLLNGTTSSCSSSSSSTSITQPNQVPQTPVTNFYWSD 
FLLSDPVPQVVGSSATSDLTFTQNEHHFNIEAEYISQNIDSKASGTCHSASSFVDEILDK
DQEMLSQFPQLLNDFDY* 
>G1067 (436..1371)

TCTCAAGCTTCTCTCTCCTTTTTTTCCCATAGCACATCAGAATCGCTAAATACGACTCCT 
ATGCAAAGAAGAAGCTACTTCTTTCTCTTGCCCTAATTAATCTACCTAACTAGGGTTTCC 
TCTTACCTTTCATGAGAGAGATCATTTA^ 

GTCTTTAATTTAGTTCTGTTCTTGGTCTGTTTCTATATTTTGTCGGCTTGCGTAACCGAT 

CACACCTTAATGCTTTAGCTATTGTTTCCTCAAAATCATGAGTTTTGACTTCTCGATCTG 

AGTTTTCTTTTTCTCTCTTTACGCTCTTCTTCACCTAGCTACCAATATATGAACGAGCAG 

GATCAAGAATCGAGAAATTGATTTGAGCTGGCGAATAAGCAGTGGTGGGATAGGGAATTA 

GTAGATGCGGCGGCGATGGAAGGCGGTTACGAGCAAGGCGGTGGAGCTTCTAGATACTTC 

CATAACCTCTTTAGACCGGAGATT<^CCACCAACAGCTTCAACCGCAGGGCGGGATCAAT 

CTTATCGACCAGCATCATC^TC^GC^C^^ 

GATTCAAGAGAATCTGACCATTCAAACAAAGATCM 

GACCCGAATACATCAAGCTCAGCACCGGGAAAACGTCCACGTGGACGTCCACCAGGATCT 
AAGAACAAAGCCAAGCCACCGATCATAGTAACTCGTGATAGCCCCAACGCGCTTAGATCT 
CACGTTCTTGAAGTATCTCCTGGAGCTGACATAGTTGAGAGTGTTTCCACGTACGCTAGG 
AGGAGAGGGAGAGGCGTCTCCGTTTTAGGAGGAAACGGCACCGTATCTAACGTCACTCTC 
CGTCAGCCAGTCACTCCTGGAAATGGCGGTGGTGTGTCCGGAGGAGGAGGAGTTGTGACT 
TTACATGGAAGGTTTGAGATTCTTTCGCTAACGGGGACTGTTTTGCCACCTCCTGCACCG 
CCTGGTGCCGGTGGTTTGTCTATATTTTTAGCCGGAGGGCAAGGTCAGGTGGTCGGAGGA 
AGCGTTGTGGCTCCCCTTATTGCATCAGCTCCGGTTATACTAATGGCGGCTTCGTTCTCA 
AATGCGGTTTTCGAGAGACTACCGATTGAGGAGGAGGAAGAAGAAGGTGGTGGTGGCGGA
GGAGGAGGAGGAGGAGGGCCACCGCAGATGCAACAAGCTCCATCAGCATCTCCGCCGTCT

GGAGTGACCGGTCAGGGACAGTTAGGAGGTAATGTGGGTGGTTATGGGTTTTCTGGTGAT 

CCTCATTTGCTTGGATGGGGAGCTGGAACACCTTCAAGACCACCTTTTTAATTGAATTTT 

AATGTCCGGAAATTTATGTGTTTTTATCATCTTGAGGAGTCGTCTTTCCTTTC 

TGGTGTTTAATGTTTAGTTGATATGCATATTTT 

>G1067 Amino Acid Sequence (domain in AA coordinates: 86-93) 
MEGGYEQGGGASRYFHNLFRPEIHHQQLQPQGGINLIDQHHHQHQQHQQQQQPSDDSRES
DHSNKDHHQQGRPDSDPNTSSSAPGKRPRGRPPGSKNKAKPPIIVTRDSPNALRSHVLEV
SPGADIVESVSTYARRRGRGVSVLGGNGTVSNVTLRQPVTPGNGGGVSGGGGVVTLHGRF
EILSLTGTVLPPPAPPGAGGLSIFLAGGQGQVVGGSVVAPLIASAPVILMAASFSNAVFE
RLPIEEEEEEGGGGGGGGGGGPPQMQQAPSASPPSGVTGQGQLGGNVGGYGFSGDPHLLG 

WGAGTPSRPPF* 
>G1075 (19..876)

TTTGTGTTTGGTGCTGGCATGGCTGGTCTCGATCTAGGCACAACTTCTCGCTACGTCCAC 
AACGTCGATGGTGGCGGCGGCGGACAGTTCACCACCGACAACCACCACGAAGATGACGGT 
GGCGCTGGAGGAAAC(^CCATCATCACCATCATAATCATAATCACCATCAAGGTTTAGA^ 
TTAATAGCTTCTAATGATAACTCTGGACTAGGCGGCGGTGGAGGAGGAGGGAGCGGTGAC 
CTCGTCATGCGTCGGCCACGTGGCCGTCCAGCTGGATCGAAGAACAAACCGAAGCCGCCG 
GTGATTGTC^CGCGCGAGAGCGCAAACACTCTTAGGGCTCACATTCrrTGAAGTTGGAAGT 
GGCTGCGACGTTTTCGAATGTATCTCCACTTACGCTCGTCGGAGACAGCGCGGGATTTGC 
GTTTTATCCGGGACGGGAACCGTCACTAACGTCAGCATCCGTCAGCCTACGGCGGCCGGA 
GCTGTTGTGACTCTGCGGGGTACTTTTGAGATTCTTTCCCTCTCCGGATCTTTTCTTCCG 
CCACCTGCTCCTCCAGGGGCGACTAGCTTGACGATATTCCTCGCTGGAGCTCAAGGACAG 
GTCGTCGGAGGTAACGTAGTTGGTGAGTTAATGGCGGCGGGGCCGGTAATGGTCATGGCA 
GCGTCTTTTACAAACGTGGCTTACGAAAGGTTGCCTTTGGACGAGCATGAGGAGCACTTG 
CAAAGTGGCGGCGGCGGAGGTGGAGGGAATATGTACTCGGAAGCCACTGGCGGTGGCGGA 
GGGTTGCCTTTCTTTAATTTGCCGATGAGTATGCCTCAGATTGGAGTTGAAAGTTGGCAG 
GGGAATCACGCCGGCGCCGGTAGGGCTCCGTTTTAGCAATTTAAGAAACTTTAATTGTTT 
TTTCCACTTTTTTGTTTTTCTCCGAATTTTATGAAATTATGATTTAAGAAAAAAAACGAT 
ATTGTTCATGTATTGACCCTCTTACTGCATGGTTTCTTCTATTGGGTTAATTGGCTAGCT 
CATAAGAATTGTTTAATTTGGTTATTGTCATCAAATTTGCCCACATATAAAGCTTCTAGC 

AAAT 

>G1075 Amino Acid Sequence (domain in AA coordinates: 78-85) 
MAGLDLGTTSRYVHNVDGGGGGQFTTDNHHEDDG 

NSGLGGGGGGGSGDLVMRRPRGRPAGSKNKPKPPVIVTRESANTLRAHILEVGSGCDVFE 
CISTYARRRQRGICVLSGTGTVTNVSIRQPTAAGAVVTLRGTFEILSLSGSFLPPPAPPG 
ATSLTIFLAGAQGQVVGGNVVGELMAAGPVMVMAASFTNVAYERLPLDEHEEHLQSGGGG
GGGNMYSEATGGGGGLPFFNLPMSMPQIGVESWQGNHAGAGRAPF* 
>G1266 (62..718)

CAATCCACTAACGATCCCTAACCGAAAACAGAGTAGTCAAGAAACAGAGTATTTTTTCTA 
CATGGATCCATTTTTAATTCAGTCCCCATTCTCCGGCTTCTCACCGGAATATTCTATCGG 
ATCTTCTCCAGATTCTTTCTCATCCTCTTCTTCTAACAATTACTCTCTTCCCTTCAACGA 
GAACGACTCAGAGGAAATGTTTCTCTACGGTCTAATCGAGCAGTCCACGCAACAAACCTA 
TATTGACTCGGATAGTCAAGACCTTCCGATCAAATCCGTAAGCTCAAGAAAGTCAGAGAA 
GTCTTACAGAGGCGTAAGACGACGGCCATGGGGGAAATTCGCGGCGGAGATAAGAGATTC 
GACTAGAAACGGTATTAGGGTTTGGCTCGGGACGTTCGAAAGCGCGGAAGAGGCGGCTTT 
AGCCTACGATCAAGCTGCTTTCTCGATGAGAGGGTCCTCGGCGATTCTCAATTTTTCGGC 
GGAGAGAGTTCAAGAGTCGCTTTCGGAGATTAAATATACCTACGAGGATGGTTGTTCTCC 
GGTTGTGGCGTTGAAGAGGAAACACTCGATGAGACGGAGAATGACCAATAAGAAGACGAA 
AGATAGTGACTTTGATCACCGCTCCGTGAAGTTAGATAATGTAGTTGTCTTTGAGGATTT 
GGGAGAACAGTACCTTGAGGAGCTTTTGGGGTCTTCTGAAAATAGTGGGACTTGGTGAAA 
GATTAGGATTTGTATTAGGGACCTTAAGTTTGAAGTGGTTGATTAATTTTAACCCTAATA 
TGTTTTTTGTTTGCTTAAATATTTGATTCTATTGAGAAACATCGAAAACAGTTTGTATGT 

ACTTTTGTGATACTTGGCG 

>G1266 Amino Acid Sequence (domain in AA coordinates: 79-147) 

MDPFLIQSPFSGFSPEYSIGSSPDSFSSSSSNNYSLPFNENDSEEMFLYGLIEQSTQQTY 

IDSDSQDLPIKSVSSRKSEKSYRGVRRRPWGKFAAEIRDSTRNGIRVWLGTFESAEEAAL 
AYDQAAFSMRGSSAILNFSAERVQESLSEIKYTYEDGCSPVVALKRKHSMRRRMTNKKTK

DSDFDHRSVKLDNVVVFEDLGEQYLEELLGSSENSGTW*

>G1311 (41..757)

AAGTATAATAACACAAAGAAACAGAGTAAAAGAAAGAAAAATGGATTTTAAGAAGGAAGA 
AACACTTCGTAGAGGGCCATGGCTCGAAGAAGAAGACGAACGGCTAGTGAAGGTCATTAG 
TCTTTTGGGAGAACGTCGTTGGGATTCTTTAGCAATA 

TAAGAGTTGCAGGCTAAGGTGGATGAACTATCTGAATCCGACTCTGAAGCGTGGACCGAT 
GAGTCAAGAAGAAGAGAGAATC^TCTTTCAGCTCCATGCTCTATGGGGTAACAAGTGGTC 
GAAGATTGCGAGAAGATTACCCGGTAGGACTGATAACGAGATAAAGAACTATTGGAGAAC 
TCATTATAGAAAGAAACAGGAAGCTCAAAACTATGGAAAGCT 

TACAGGAGAAGAATTGTTGCACAAGTATAAGGAAACAGAGATCACTAGGACAAAGACGAC 
GTCTCAAGAACATGGTTTTGTTGAAGTTGTGAGCATGGAAAGTGGTAAAGAAGCCAACGG 
TGGTGTTGGTGGAAGAGAAAGCTTCGGTGTTATGAAATCACCGTATGAAAATCGGATTTC 
GGATTGGATATCAGAGATTTCTACTGACCAGAGTGAAGCAAATCT 

CAGCAATAGCTGCAGTGAGAACAATATTAACATTGGTACTTGGTGGTTTCAAGAGACTAG 
GGACTTTGAGGAGTTTTCATGTTCTCTATGGTCATAATTCTAAAGTTGGTTTATTTACCT 
TTTAAAAAAAAAAAAAAAAA 

>G1311 Amino Acid Sequence (domain in AA coordinates: 11-112) 
MDFKKEETLRRGPWLEEEDERLVKVISLLGERRWDSL^ 

TLKRGPMSQEEERIIFQLHALWGNKWSKIARRLPGRTDNEIKNYWRTHYRKKQEAQNYGK

LFEWRGNTGEELLHKYKETEITRTKTTSQEHGFVEVVSMESGKEANGGVGGRESFGVMKS

PYENRISDWISEISTDQSEANLSEDHSSNSCSENNINIGTWWFQETRDFEEFSCSLWS* 

>G1321 (72..803)

GTTCTTGTATTGGTTTGGATCGGTATACTTAGTTGATTACGTAATTAAATAGATCGGCGT 

GAAGAAGAAAAATGATCATGTGCAGCCGAGGCCATTGGAGACCAGCTGAAGACGAGAAGC 

TCAAGGATCTTGTCGAACAATACGGTCCTCACAATTGGAACGCCATTGCTCTCAAGCTTC 

CTGGTCGCTCTGGTAAGAGTTGTAGATTGAGATGGTTTAATCAATTGGATCCAAGGATCA 

ACCGAAACCCTTTCACGGAAGAAGAAGAAGAAAGACTTTTAGCGGCTCATCGGATCCATG 

GGAACAGATGGTCCATCATCGCAAGGCTTTTCCCTGGAAGAACTGATAACGCCGTCAAGA 

ACCATTGGCACGTCATCATGGCTCGTCGCACACGCCAAACCTCTAAGCCTCGTCTTCTTC 

CCTCGACGACTTCGTCTTCTTCTTTAATGGCGAGTGAACAAATCATGATGAGTTCTGGTG 

GTTATAATCATAATTATAGTTCCGATGATCGGAAGAAAATATTTCCAGCAGACTTTATAA 

ATTTCCCTTACAAATTCTCTCATATCAATCATCTTCACTTCCTAAAGGAGTTTTTCCCCG

GAAAGATCGCTTTAAGTCACAAAGCAAATCAGAGTAAGAAGCCTATGGAGTTCTACAATT

TTCTACAAGTAAACACAGATTCAAACAAGAGCGAGATTATAGATCAAGATTCAGGTCAAA 

GCAAACGCAGTGACTCGGACACCAAACATGAAAGTCATGTTCCATTCTTCGACTTT^ 

CCGTTGGAAACTCTGCCTCCTAGGATTAGTTTTTTTGCAGTAACTCCTAAATTTCTAGAT 

TAACTATTTAGTCCGTATACGTACGAGATTATCTAGGTCGTTAGCATGTATGCTTGATGT 

GTATAATCACTAACTAGTGAGCTATTACCTGCGAAAATTGTAAGAAAAATACATAATGTT 

GATGTATCACACATTCTCAATGTCTGTAAAATTTCCATCGAGTTGTTAACTATCAAAGTT 

ATCCGTTTGAAAAAAAAAAAA 

>G1321 Amino Acid Sequence (domain in AA coordinates: 4-106) 
MIMCSRGHWRPAEDEKLKDLVEQYGPHNWNAIALKLPGRSGKSCRLRWFNQLDPRINRNP 
FTEEEEERLLAAHRIHGNRWSIIARLFPGRTDNAVKNHWHVIMARRTRQTSKPRLLPSTT
SSSSLMASEQIMMSSGGYNHNYSSDDRKKIFPADFINFPYKFSHINHLHFLKEFFPGKIA 
LSHKANQSKKPMEFYNFLQVNTDSNKSEIIDQDSGQSKRSDSDTKHESHVPFFDFLSVGN 

SAS* 

>G1326 (32..784)

CGACGGTACGGTGGAGATAGAGATAGCATCCATGGAGATGTCTAGAGGAAGCAACAGTTT 
TGACAATAAGAAGCCTAGTTGCCAAAGAGGTCACTGGAGACCTGTTGAAGATGACAATCT 
CCGGCAACTCGTTGAACAATACGGTCCCAAGAACTGGAATTTTATTGCTCAACATCTCTA 
TGGAAGATCAGGGAAAAGCTGTAGATTAAGATGGTACAACCAACTTGATCCAAACATCAC 
CAAGAAACCCTTCACCGAGGAGGAAGAAGAGAGACTGCTTAAAGCTCATCGGATCCAAGG 
GAATCGTTGGGCCTCCATAGCCCGACTGTTCCCCGGGAGGACCGACAACGCTGTCAAAAA 
CCATTTTCATGTCATCATGGCTAGACGCAAACGGGAAAACTTCTCTTCCACAGCTACTTC 
TACGTTCAACCAAACTTGGCATACTGTTTTGAGCCCTAGTTCTAGTCTTACAAGGCTAAA 
TAGATCCCATTTCGGGCTATGGAGGTATCGAAAGGATAAGAGTTGCGGTCTCTGGCCTTA 
CTCTTTTGTTTCACCACCTACGAATGGTCAAT^

CCACGAAATTTATCTTGAGAGGAGAAAGTCGAAAGAGTTGGTGGATCCTCAGAATTACAC 

ATTTCATGCAGCCACACCAGATCATAAGATGACTTCAAATGAAGATGGACCATCCATGGG 

AGATGATGGTGAGAAGAACGATGTTACTTTCATTGATTTTCTTGGTGTTGGATTAGCTTC 

TTAGGTTATAACATCACAAGTCAAAGCTTTTAAGGGTTTCTATCATTA 

ATTTTCAGCCTTTTGCTTCCTTAAACTCTCATATGGATCT 

>G1326 Amino Acid Sequence (domain in AA coordinates: 18-121) 
MEMSRGSNSFDNKKPSCQRGHWRPVEDDNLRQLVEQYGPKNWNFIAQHLYGRSGKSCRLR 
WYNQLDPNITKKPFTEEEEERLLKAHRIQGNRWASIARLFPGRTDNAVKNHFHVIMARRK

RENFSSTATSTFNQTWHTVLSPSSSLTRLNRSHFGLWRYRKDKSCGLWPYSFVSPPTNGQ 

FGSSSVSNVHHEIYLERRKSKELVDPQNYTFHAATPDHKMTSNEDGPSMGDDGEKNDVTF

IDFLGVGLAS*
>G1367 (128..1567)

TCCTTCCACAAAACTTTTTTAATTTTATCTGAAAAATTAAAACAACCGAAACAAAAA 

AAAACTAAAAATCAAAAATCTCATCACCTTCCTTGCTCTGTATTTTTTCTCTCTCACTAA 

ATCCTCCATGGATCCTTCTCTCTCTGCAACCAATGATCCTCATCATCCTCCTCCTCCTCA 

GTTCACATCTTTCCCTCCTTTCACCAAC^^ 
ClTCACCGGACCraCCGCCGTCGCGC^^ 

TCCGCAGCAGCCAC^UUVCATCTCCAGTTCCTCCTCATCC^TCTATTTCCCACCCTCCTTA 
CTCTGACATGATTTG(^CGGCGATTGCAGCGTTAAACGAACCAGATGGGTCAAGCAAGCA 
AGCTATTTCGAGGTACATAGAGAGAATTTACACTGGGATTCCTACTGCTCATGGAGCTTT 
GTTGACACACCATCTCAAGACTTTGAAGACCAGTGGGATTCTTGTCATGGTTAAGAAATC 
TTACAAGCTTGCTTCTACTCCTCCTCCTCCTCCTCCTACTAGTGTAGCTCCTAGTCTTGA 
ACCTCCCAGATCTGATTTCATAGTCAACGAGAACCAACCTTTACCTGATCCGGTTTTGGC 
TTCTTCTACTCCTCAGACTATTAAACGTGGTCGTGGTCGACCTCCAAAAGCTAAACCAGA 
TGTTGTTCAACCTCAACCTCTGACTAATGGAAAACTCACCTGGGAACAGAGTGAATTACC 
TGTCTCTCGACCAGAGGAGATACAGATACAGCCGCCACAGTTACCGTTACAGCCACAGCA 
GCCGGTTAAGAGACCGCCGGGTCGTCCTAGAAAAGATGGAACTTCGCCGACGGTGAAGCC 
AGCTGCTTCTGTTTCCGGTGGTGTGGAGACTGTGAAACGAAGAGGTAGACCTCCGAGTGG 
AAGAGCTGCTGGGAGGGAGAGAAAGCCTATAGTAGTCTCAGCTCCAGCTTCAGTGTTCCC 
GTATGTTGCTAATGGTGGTGTTAGACGCCGAGGGAGACCAAAGAGAGTTGACGCTGGTGG 
TGCTTCCTCTGTTGCTCCACCACCACCACCACCAACTAACGTAGAGAGTGGAGGAGAGGA 
GGTTGCAGTCAAGAAACGAGGAAGAGGACGGCCTCCTAAGATTGGAGGTGTTATCAGGAA 
GCCTATGAAGCCGATGAGAAGCTTTGCTCGTACTGGAAAACCCGTAGGAAGACCCAGAAA 
GAATGCGGTGTCAGTGGGAGCTTCTGGACGACAAGATGGTGACTATGGAGAACTGAAGAA 
GAAGTTTGAGTTGTTTCAAGCGAGAGCTAAGGATATTGTAATTGTGTTGAAATCCGAGAT 
AGGAGGAAGTGGAAATCAAGCAGTGGTTCAAGCCATACAGGACCTGGAAGGGATAGCAGA 
GACAACAAACGAGCCAAAGCACATGGAAGAAGTGCAGCTGCCAGACGAGGAACACCTTGA 
AACCGAACCAGAAGCAGAGGGTCAAGGACAGACAGAAGCAGAGGCAATGCAAGAAGCTCT 
GTTCTAAAGATAAAGCCTTGACATAAAAAGCTAGCAAGTGGTGGGTTTACTTGTTGTGTG 
TTACATGAAATTTTTAATCTTATAAGGGTGTTTGCAGGAGAAAAACAAAAAGAACAATGT 
GATGAACTGATGATGATGATTGTGTCTCTAACCAAACAACAAGGAGAGGTAGGGTAATGT 
CTGTAAAGTGAATTAGGATGTTACC^TTGTTC^TGCTTCCCATCTCTCTCCATCGTCCAT 

TATTCTATTTTGTCTCCTTAGGCTTTTTAGGAGTTGTTGTTGATGTTTATCAAAAACGCT 
TATGTAATTTTTATGACCACTTCTACTTTTTATGATGGTTTCTT 

>G1367 Amino Acid Sequence (domain in AA coordinates: 179-201, 262-285, 298-319, 335-357)

MDPSLSATNDPHHPPPPQFTSFPPFTNTNPFASPNHPFFTGPTAVAPPNNIHLYQAAPPQ 
QPQTSPVPPHPSISHPPYSDMICTAIAALNEPDGSSKQAISRYIERIYTGIPTAHGALLT 
HHLKTLKTSGILVMVKKSYKLASTPPPPPPTSVAPSLEPPRSDFIVNENQPLPDPVLASS
TPQTIKRGRGRPPKAKPDVVQPQPLTNGKLTWEQSELPVSRPEEIQIQPPQLPLQPQQPV
KRPPGRPRKDGTSPTVKPAASVSGGVETVKRRGRPPSGRAAGRERKPIVVSAPASVFPYV 
ANGGVRRRGRPKRVDAGGASSVAPPPPPPTNVESGGEEVAVKKRGRGRPPKIGGVIRKPM 
KPMRSFARTGKPVGRPRKNAVSVGASGRQDGDYGELKKKFELFQARAKDIVIVLKSEIGG
SGNQAVVQAIQDLEGIAETTNEPKHMEEVQLPDEEHLETEPEAEGQGQTEAEAMQEALF*

>G1386 (89..673)
AATTTTATTTCCTTCTCTCAAATCTTCCCACCAAAAATTAACTCTTTCGTTCACACTAAG
TCCCTTTTAAAAGAAAATATCCC^TTAATGGAACGTGACGACTGCCGGAGATTTCAGGA 
CTCGCCGGCGCAGACGACGGAGAGAAGAGTGAAATATAAACGAAAGAAGAAAAGAGCCAA 
AGATGATGATGATGAGAAAGTTGTTTCGAAGCATCCAAATT^ 

ACAATGGGGAAAATGGGTGTCCGAAATCAGAGAGCCAAAAAAGAAATCAAGAATCTGGCT 
CGGTACTTTCTCCACGGCGGAGATGGCGGCGCGTGCTCACGACGTGGCAGCTTTAGCCAT 
CAAAGGCGGTTCTGCACATCTCAACTTCCCGGAGCTCGCTTATCACCTCCCTAGACCAGC 
TAGTGCCGACCCTAAAGACATCCAAGCTGCCGCCGCCGCAGCTGCAGCCGCTGTGGCCAT 
TGACATGGATGTAGAGACGTCTTCGCCGTCGCCATCTCCCACAGTTACGGAAACGTCATC 
TCCGGCTATGATAGCACTCTCCGACGACGCGTTCTCCGATCTTCCTGATCTCTTGCTCAA 
CGTGAACCATAAC^TCGATGGCTTCTGGGACTCTTTTCCCTATGAAGAACCCTTCCTCTC 
TCAAAGTTACTAGAAACTCAAAACTATGTCGTTTTTC 

TTTTTTGACGTCGAAAATCACCCGGATAATCCAAATTGTATGATTTATTAATGGTTGAT^ 
ATTTTCTTTGTGTGGAACAATGTGTATGATACGTAATCAAAAGTTCAAAAAAAAAATAAA 

AAAAA 

>G1386 Amino Acid Sequence (domain in AA coordinates: TBD) 
MERDDCRRFQDSPAQTTERRVKYKPKKKRAKDDDDEKVVSKHPNFRGVRKRQWGKWVSEI
REPKKKSRIWLGTFSTAEMAARAHDVAALAIKGGSAHLNFPELAYHLPRPASADPKDIQA

AAAAAAAAVAIDMDVETSSPSPSPTVTETSSPAMIALSDDAFSDLPDLLLNVNHNIDGFW
DSFPYEEPFLSQSY*
>G1421 (292..1155)

GAAATTTCATCCCTAAATAAGAAAAAAGCATCTCCTTCTTTAGTGTCCTCCTTCACCAAA 
CTCTTGATTCCATAAGCATATATTAAAAAAGCTCTCTGCTTTCTTCAACTTTCCCGGGAA 
AATCTTCTTGTTAGAAAGCATCAATCTCTTC 
TTTGCCCTTTACTTTTCCTAACTTTGGTCTTTA 

CACACATAAGTTAAAACTATTACAACAGTTTTAAAGAGAGAGATTTAAAAAATGGAGACA 
GAGAAGAAAGTTTCTCTCCCAAGAATCTTACGAATCTCTGTTACTGATCCTTACGCAACA 
GATTCGTCAAGCGACGAAGAAGAAGAAGTTGATTTTGATG(^TTATCTACAAAACGACGT 
CGTGTTAAGAAGTACGTGAAGGAAGTGGTGCTTGATTCGGTGGTTTCTGATAAAGAGAAG
CCGATGAAGAAGAAGAGAAAGAAGCGCGTTGTTACTGTTCCAGTGGTTGTTACGACGGCG 
ACGAGGAAGTTTCGTGGAGTGAGGCAAAGACCGTGGGGAAAATGGGCGGCGGAGATTAGA 
GATCCGAGTAGACGTGTTAGGGTTTGGTTAGGTACTTTTGACACGGCGGAGGAAGCTGCC 
ATTGTTTACGATAACGCAGCTATTCAGCTACGTGGTCCTAACGCAGAGCTTAACTTCCCT 
CCTCCTCCGGTGACGGAGAATGTTGAAGAAGCTTCGACGGAGGTGAAAGGAGTTTCGGAT 
TTTATCATTGGCGGTGGAGAATGTCTTCGTTCGCCGGTTTCTGTTCTCGAATCTCCGTTC 
TCCGGCGAGTCTACTGCGGTTAAAGAGGAGTTTGTCGGTGTATCGACGGCGGAGATTGTG 
GTTAAAAAGGAGCCGTCTTTTAACGGTTCAGATTTCTCGGCGCCGTTGTTCTCGGACGAC 
GACGTTTTTGGTTTCTCGACGTCGATGAGTGAAAGTTTCGGCGGCGATTTATTTGGAGAT 
AATCTTTTTGCGGATATGAGTTTTGGATCCGGGTTTGGATTCGGGTCTGGGTCTGGATTC 
TCCAGCTGGCACGTTGAGGACCATTTTCAAGATATTGGGGATTTATTCGGGTCGGATCCT 
GTCTTAACTGTTTAAGAAATAACTGGCCGTTTAACGGCGTTTAGTGAAGTTTTGTTACCG 
GCGACGGCGAGGATTAAAAAAAAACGGCGATTTATTTTTTGAATGAAGATTTGTTAAATA 
>G1421 Amino Acid Sequence (domain in AA coordinates: 74-151) 
METEKKVSLPRILRISVTDPYATDSSSDEEEEVDFDALSTKRRRVKKYVKEVVLDSVVSD
KEKPMKKKRKKRVVTVPVVVTTATRKFRGVRQRPWGKWAAEIRDPSRRVRVWLGTFDTAE

EAAIVYDNAAIQLRGPNAELNFPPPPVTENVEEASTEVKGVSDFIIGGGECLRSPVSVLE
SPFSGESTAVKEEFVGVSTAEIVVKKEPSFNGSDFSAPLFSDDDVFGFSTSMSESFGGDL
FGDNLFADMSFGSGFGFGSGSGFSSWHVEDHFQDIGDLFGSDPVLTV*
>G1453 (39..917)

CGTCGACGCGAAATAAATCCTAGAAAATAACTATCAATATGATGAAGGTTGATCAAGATT 
ATTCGTGTAGTATACCGCCTGGATTTAGGTTTCATCCGACAGATGAAGAACTTGTCGGAT 
ATTATCTCAAGAAGAAAATCGCCTCCCAGAGGATTGATCTCGACGTTATCAGAGAAATTG 
ATCTTTACAAGATCGAACCATGGGATCTACAAGAGAGATGTAGGATAGGGTACGAGGAGC 
AAACGGAGTGGTATTTCTTCAGCCATAGAGACAAGAAGTATCCGACTGGGACTAGGACAA 
ACCGAGCCACCGTGGCCGGTTTCTGGAAAGCAACGGGCCGGGACAAGGCGGTTTACCTCA 
ACTCCAAACTTATCGGTATGAGAAAAACGCTTGTCTTTTACCGAGGTCGAGCGCCTAATG 
GCCAT^AAGTCCGATTGGATCATTCACGAATACTACAGCCTCGAGTCACACCAGAACTCTC 
CTCCACAGGAAGAAGGATGGGTAGTGTGTAGAGCATTTAAGAAACGAACGACCATCCCAA
CAAAAAGGAGGCAACTTTGGGATCCGAACTGCTTATTCTACGACGACGCCACTCTCTTGG
AACCTCTCGACAAGCGAGCCAGACATAATCCTGATTTTACCGCCACACCGTTCAAGCAAG
AACTACTCTCCGAGGCCAGTCACGTCCAGGATGGAGATTTCGGATCTATGTACCTTCAAT 
GCATCGATGATGATCAATTCTCCCAGCTTCCTCAGCTCGAGAGCCCCTCTCTTCCGTCGG 
AAATAACTCCCCATAGTACTACTTTTTCTGAGAACAGTAGCCGGAAAGATGACATGAGCT 
CCGAGAAGAGGATCACTGACTGGAGATATCTAGATAAGTTCGTGGCGTCTCAATTTTTGA 
TGAGTGGAGAAGACTAAAAAAGGCTTTCCTATGCATGCATGCACTAGAAACGTCGTCGCA 

TTTTGGATTTACATGCGGCCGCT 

>G1453 Amino Acid Sequence (conserved domain in AA coordinates: 
MMKVDQDYSCSIPPGFRFHPTDEELVGYYLKKKIASQRIDLDVIREIDLYKIEPWDLQER

CRIGYEEQTEWYFFSHRDKKYPTGTRTNRATVAGFWKATGRDKAVYLNSKLIGMRKTLVF

YRGRAPNGQKSDWIIHEYYSLESHQNSPPQEEGWVVCRAFKKRTTIPTKRRQLWDPNCLF
YDDATLLEPLDKRARHNPDFTATPFKQELLSEASHVQDGDFGSMYLQCIDDDQFSQLPQL
ESPSLPSEITPHSTTFSENSSRKDDMSSEKRITDWRYLDKFVASQFLMSGED*
>G1560 (120..1340)

ATCCTTTCAATTTCCACTCCTCTCTAATATAATTCACATTTTCCCACTATTGCTGATTCA 
TTTTTTTTTGTGAATTATTT 

TGGATCCTTCATTTAGGTTCATTAAAGAGGAGTTTCCTGCTGGATTCAGTGATTCTCCAT 
CACCACCATCTTCTTCTTCATACCTTTATTCATCTTCCATGGCTGAAGCAGCCATAAATG 
ATCCAACAACATTGAGCTATCCACAACCATTAGAAGGTCTCCATGAATCAGGGCCACCTC 
CATTTTTGACAAAGACATATGACTTGGTGGAAGATTCAAGAACCAATCATGTCGTGTCTT 

GGAGC^tfVATCC^TAACAGCTTC^TTC^ 
TTCCCAGATTCTTCAAGCACAATAACTTCTC 

GTTTCAGAAAGGTGAATCCGGATCGGTGGGAGTTTGCAAACGAAGGGTTTCTTAGAGGGC 
AAAAGCATCTCCTCAAGAACATAAGGAGAAGAAAAACAAGTAATAATAGTAATCAAATGC 
AACAACCTCAAAGTTCTGAACAACAATCTCTAGACAATTTTTGCATAGAAGTGGGTAGGT 
ACGGTCTAGATGGAGAGATGGACAGCCTAAGGCGAGACAAGCAAGTGTTGATGATGGAGC 
TAGTGAGACTAAGACAGCAACAACAAAGCACCAAAATC^ 

AGCTCAAGAAGACCGAGTCAAAACAAAAACAAATGATGAGCTTCCTTGCCCGCGCAATGC 
AGAATCCAGATTTTATTCAGCAGCTAGTAGAGCAGAAGGAAAAGAGGAAAGAGATCGAAG 
AGGCGATCAGCAAGAAGAGACAAAGACCGATCGATCAAGGAAAAAGAAATGTGGAAGATT 
ATGGTGATGAAAGTGGTTATGGGAATGATGTTGCAGCCTCATCCTCAGCATTGATTGGTA 
TGAGTCAGGAATATACATATGGAAACATGTCTGAATTCGAGATGTCGGAGTTGGACAAAC 
TTGCTATGCACATTCAAGGACTTGGAGATAATTCCAGTGCTAGGGAAGAAGTCTTGAATG 
TGGAAAAAGGAAATGATGAGGAAGAAGTAGAAGATCAACAACAAGGGTACCATAAGGAGA 
ACAATGAGATTTATGGTGAAGGTTTTTGGGAAGATTTGTTAAATGAAGGTCAAAATTTTG 
ATTTTGAAGGAGATCAAGAAAATGTTGATGTGTTAATTCAGCAACTTGGTTATTTGGGTT 
CTAGTTCACACACTAATTAAGAAGAAATTGAAATGATGACTACTTTAAGCATTTGAATCA 
ACTTGTTTCCTATTAGTAATTTGGCTTTGTTTCAATCAAGTGAGTCGTGGACTAACTTGC 

>G1560 Amino Acid Sequence (domain in AA coordinates: 62-151) 
MDPSFRFIKEEFPAGFSDSPSPPSSSSYLYSSSMAEAAINDPTTLSYPQPLEGLHESGPP 
PFLTKTYDLVEDSRTNHVVSWSKSNNSFIVWDPQAFSVTLLPRFFKHNNFSSFVRQLNTY
GFRKVNPDRWEFANEGFLRGQKHLLKNIRRRKTSNNSNQMQQPQSSEQQSLDNFCIEVGR

YGLDGEMDSLRRDKQVLMMELVRLRQQQ

QNPDFIQQLVEQKEKRKEIEEAISKKRQRPIDQGKRNVEDYGDESGYGNDVAASSSALIG

MSQEYTYGNMSEFEMSELDKLAMHIQGLGDNSSAREEVLNVEKGNDEEEVEDQQQGYHKE

NNEIYGEGFWEDLLNEGQNFDFEGDQENVDVLIQQLGYLGSSSHTN*

>G1594 (1..984) 

ATGGATGGAATGTACAATTTCCATTCGGCCGGTGATTATTCAGATAAGTCGGTTCTGATG 
ATGTCACCGGAGAGTCTCATGTTTCCTTCCGATTACCAAGCTTTGCTATGTTCCTCCGCC 
GGTGAAAATCGTGTCTCTGATGTTTTCGGATCCGACGAGCTACTCTCAGTAGCCGTCTCC 
GCTTTGTCGTCGGAGGCGGCTTCGATCGCTCCGGAGATCCGAAGAAATGATGATAACGTT 
TCTCTAACTGTCATCAAAGCTAAAATCGCTTGTCATCCTTCGTATCCTCGCTTACTTCAA 
GCTTACATCGATTGCCAAAAGGTCGGAGCACCACCGGAGATAGCGTGTTTACTAGAGGAG 
ATTCAACGGGAGAGTGATGTTTATAAGCAAGAGGTTGTTCCTTCTTCTTGCTTTGGAGCT 
GATCCTGAGCTTGATGAATTTATGGAAACGTACTGCGATATATTAGTGAAATACAAATCG 
GATCTAGCAAGACCGTTTGACGAGGCAACGTGTTTCTTGAACAAGATTGAGATGCAGCTA
CGGAACCTATGTACTGGTGTCGAGTCTGCCAGGGGAGTTTCTGAGGATGGTGTAATATCA 
TCTGACGAGGAACTGAGTGGAGGTGATCATGAGGTAGCAGAGGATGGGAGACAAAGATGT 
GAAGACCGGGACCTCAAAGATAGGTTGCTACGCAAATTTGGAAGCCGTATTAGTACTTTA 
AAGCTTGAGTTCTCAAAGAAGAAGAAGAAAGGAAAGTTACCAAGAGAAGCAAGACAAGCT 
CTTCTTGATTGGTGGAATCTCCATTATAAGTGGCCTTACCCTACTGAAGGAGATAAGATA 
GCATTAGCTGATGCAACGGGGTTAGACCAAAAACAAATCAACAATTGGTTTATAAACCAA 
AGGAAACGTCATTGGAAGCCATCAGAGAATATGCCTTTCGCTATGATGGATGATTCTAGT 
GGATCATTCTTTACCGAGGAATGA 

>G1594 Amino Acid Sequence (conserved domain in AA coordinates: 343-308)

MDGMYNFHSAGDYSDKSVLMMSPESLMFPSDYQALLCSSAGENRVSDVFGSDELLSVAVS

ALSSEAASIAPEIRRNDDNVSLTVIKAKIACHPSYPRLLQAYIDCQKVGAPPEIACLLEE

IQRESDVYKQEVVPSSCFGADPELDEFMETYCDILVKYKSDLARPFDEATCFLNKIEMQL

RNLCTGVESARGVSEDGVISSDEELSGGDHEVAEDGRQRCEDRDLKDRLLRKFGSRISTL

KLEFSKKKKKGKLPREARQALLDWWNLHYKWPYPTEGDKIALADATGLDQKQINNWFINQ

RKRHWKPSENMPFAMMDDSSGSFFTEE*

>G1750 (94..1101)

CCCTTTTCCTCTCTTTCTCCAAATCTCTGAAAATTTTCACCAGAATCTCTGTTCTTTTTT 

TCACCAGAATCTCTCTGTTTAAAATAATAGGTGATGATGATGGATGAGTTTATGGATCTT 

AGACCAGTGAAGTACACAGAGCAC^^GACTGTTATCAGAAAGTACACTAAAAAGTCGTCT 

ATGGAGAGGAAGACCAGTGTTCGTGACTCGGCCAGGTTGGTTCGGGTCTCAATGACGGAT 

CGTGACGCCACTGATTCATCAAGCGACGAGGAAGAGTTTCTGTTCCCTCGAAGACGTGTC 

AAGAGATTGATTAACGAGATCAGAGTCGAGCCTAGCAGCTCTTCCACCGGCGACGTCTCT 

GCTTCTCCGACGAAGGACCGGAAAAGAATCAACGTTGATTCTACGGTTCAAAAGCCCTCT 

GTTTCCGGCCAAAACCAGAAGAAGTACCGCGGCGTGAGACAGCGACCATGGGGAAAATGG 

GCGGCGGAGATTCGTGATCCTGAGCAACGCCGGAGAATCTGGCTCGGTACTTTTGCAACG 

GCGGAGGAAGCTGCCATCGTCTACGACAACGCAGCAATCAAACTTCGTGGCCCTGATGCT 

CTTACCAACTTCACCGTACAACCAGAACCAGAACCGGTACAAGAACAAGAACAAGAACCG

GAGAGCAACATGTCGGTTTCGATATCAGAATCAATGGACGATTCTCAACATCTATCATCT

CCGACATCGGTTCTCAACTACCAAACATATGTCTCGGAGGAACCAATCGATAGTCTTATC 

AAACCGGTTAAACAAGAGTTTCTTGAACCAGAACAAGAGCCAATAAGCTGGCATCTTGGA 

GAAGGTAATACTAATACTAATGATGATTCATTTCCATTGGACATTACATTTCTCGACAAC 

TATTTCAATGAATCATTACCAGACATCTCCATCTTCGATCAACCTATGTCTCCTATTCAA 

CCAACAGAGAATGATTTCTTCAACGACCTTATGTTATTCGATAGCAACGCAGAAGAATAC 

TACTCCTCCGAGATCAAAGAGATTGGTTCATCGTTCAACGATCTTGATGATTCTTTGATA 

TCCGATCTCTTACTTGTGTGATATTTTTGCCATTAACCAAACACCGGTTTGGTTGC 

>G1750 Amino Acid Sequence (domain in AA coordinates: 107-173) 

MMMDEFMDLRPVKYTEHKTVIRKYTKKSSMERKTSVRDSARLVRVSMTDRDATDSSSDEE

EFLFPRRRVKRLINEIRVEPSSSSTGDVSASPTKDRKRINVDSTVQKPSVSGQNQKKYRG

VRQRPWGKWAAEIRDPEQRRRIWLGTFATAEEAAIVYDNAAIKLRGPDALTNFTVQPEPE

PVQEQEQEPESNMSVSISESMDDSQHLSSPTSVLNYQTYVSEEPIDSLIKPVKQEFLEPE

QEPISWHLGEGNTNTNDDSFPLDITFLDNYFNESLPDISIFDQPMSPIQPTENDFFNDLM

LFDSNAEEYYSSEIKEIGSSFNDLDDSLISDLLLV*

>G1947 (70..918)

ACAACTATTCTCTCCTCTCTCTTTTTTTATTAAAAAAGCTCAAATTTATATAGGTTTTTT 

GTTCACAAAATGGATTATAACCTTCCAATTCCATTAGAGGGTCTCAAAGA^ 

ACGGCTTTCTTGACGAAAACATACAACATAGTGGAGGATTCAAGCACAAACAACATAGTT

TCATGGAGCAGAGACAACAACAGCTTCATTGTTTGGGAACCAGAGACTTTTGCCCTAATT

TGCCTCCCTAGATGCTTTAAGCACAATAATTTCTCCAGCTTTGTTAGACAGCTCAATACT 

TATGGGTTTAAGAAGATTGATACAGAGAGATGGGAATTTGCAAATGAGCATTTTCTGAAG 

GGAGAGAGGCATCTTCTTAAGAACATCAAGAGAAGAAAGACATCATCTCAAACGCAAACG 

CAGTCGCTAGAAGGAGAGATCCATGAGCTGCGAAGAGACAGAATGGCTTTAGAAGTAGAA 

CTGGTTAGACTGCGACGAAAACAAGAAAGCGTGAAGACATATCTGCATTTGATGGAAGAG 

AAACTGAAAGTCACAGAAGTAAAGCAAGAAATGATGATGAATTTCTTGCTAAAGAAGATT 

AAGAAACCGAGTTTTTTACAGAGCTTAAGGAAACGTAATCTGCAAGGAATCAAGAATCGA 

GAGCAAAAGCAAGAGGTGATCTCAAGCCATGGTGTTGAGGATAATGGAAAGTTTGTTAAA 

GCTGAGCCAGAAGAGTATGGTGATGACATCGATGATCAATGTGGAGGTGTGTTTGATTAT 
GGTGATGAGCTTCACATAGCTTCAATGGAGCATCAAGGACAAGGGGAGGATGAAATTGAA
ATGGATAGTGAAGGAATTTGGAAGGGTTTCGTGTTGAGTGAGGAGGAGATGTGTGATTTA
GTGGAACATTTTATATAATAAACTAATGTATTATGAGAGGT^ 
TTTTTTTTCCGAGTTTGTCATCAAGCA^ 

CAAAATATTTGGCCTTGG(^TTTGTTAACAAATTGACTAATTCGGCCAC 

>G1947 Amino Acid Sequence (domain in AA coordinates: 37-120) 

MDYNLPIPLEGLKETPPTAFLTKTYNIVEDSSTNNIVSWSRDNNSFIVWEPETFALICLP

RCFKHNNFSSFVRQLNTYGFKKIDTERWEFANEHFLKGERHLLKNIKRRKTSSQTQTQSL

EGEIHELRRDRMALEVELVRLRRKQESVKTYLHLMEEKLKVTEVKQEMMMNFLLKKIKKP
SFLQSLRKRNLQGIKNREQKQEVISSHGVEDNGKFVKAEPEEYGDDIDDQCGGVFDYGDE
LHIASMEHQGQGEDEIEMDSEGIWKGFVLSEEEMCDLVEHFI*
>G2011 (309..1547)

AATGTCGGTTGTACAATTATTTGTCACTAAAGTTTCCAAATTTCTTCTAAACTGATGAAT 

CAATGGAACATGATGACGAAAAAGATAAATCCACGGTGGCGGGAACTGACCCACCCATTT 

CCACCGCCTCTCTATTCCCCAGATTTTTTTCAATTATCTGACTACAG 

TCCTTCCCTAAACCTTTATAAACCATTAAACCTCTCATCCTTCTTCTCTTAAACCCCCTA 

ATTATCACACAC^CCCCAATTTCTCACTCTCTCTCTCACTAAAACCCGTAAATTTTCTAC 

TATATCAAATGAGCCCAAAAAAAGATGCTGTTTCTAAACCAACTCCAATTTCAGTACCCG 

TTTCGAGACGATCCGATATACCCGGGTCTCTCTACGTCGACACTGACATGGGTTTCTCTG 

GGTCACCACTTCCCATGCCACTAGACATCTTACAAGGGAATCCAATTCCACCTTTTTTAT 

CCAAGACTTTTGATTTGGTTGATGACCCGACTCTTGACCCGGTCATCTCTTGGGGACTGA 

CCGGAGCTAGCTTCGTAGTTTGGGATCCTCTAGAGTTTGCCAGAATCATACTTCCAAGGA 

ATTTCAAACACAAC^TTTCTCCAGCTTCGTCAGACAGCTT^CACTTATGGATTTCGAA 

AGATTGATACTGACAAGTGGGAATTCGCTAACGAGGCTTTCCTTAGAGGCAAGAAGCATC 

TTCTGAAGAACATTCATCGTCGTCGATCACCACAATCCAACCAAACTTGCTGCAGTAGCA 

CTAGCCAAAGCCAAGGGTCACCTACTGAGGTTGGAGGAGAGATTGAGAAGCTGAGGAAAG 

AGCGGCGTGCATTGATGGAGGAAATGGTTGAGCTTCAGCAGCAAAGCAGAGGCACAGCTC 

GACATGTGGACACTGTAAACCAGAGGCTGAAAGCTGCAGAGCAACGTCAGAAGCAATTGC 

TCTCTTTCTTGGCTAAGTTGTTTCAGAACCGGGGTTTCTTGGAACGCCTGAAGAACTTCA 

AAGGAAAAGAAAAAGGAGGAGCTCTTGGATTGGAAAAGGCGAGAAAGAAGTTCATCAAGC 

ACCACCAGCAGCCTCAAGATTCTCCAACAGGAGGGGAGGTGGTGAAGTATGAAGCTGATG 

ATTGGGAGAGATTGCTAATGTATGACGAAGAGACTGAGAACACCAAGGGTTTAGGAGGGA 

TGACTTCAAGCGATCGAAAAGGCAAGAACTTGATGTATCCATCAGAAGAAGAGATGAGC^ 

AACCAGATTACTTGATGTCCTTCCCATCTCCTGAAGGACTTATTAAACAAGAAGAGACGA 

CATGGAGCATGGGTTTCGATACTACAATACCGAGTTTCAGCAACACCGATGCATGGGGAA 

AC^CAATGGACTATAATGATGTCTCAGAGTTTGGTTTTGCTGCAGAAACAACAAGTGATG 

GTTTGCCTGATGTCTGCTGGGAACAATTTGCTGCAGGAATCACAGAGACTGGATTCAACT 

GGCCAACTGGTGATGATGATGATAATACGCCAATGAATGATCCTTAGGATCTTTTCATAT 

ATAGTTTAGACCAAAAACCCGTTTCTTATCGGGTGAACTATTAATTCATTATTCATTTTG 

AATGCACTCTTTATACATATATATAATATTGATGAGTTTGATTGTTCCAAAAAAAAA 

>G2011 Amino Acid Sequence (domain in AA coordinates: 56-147) 

MSPKKDAVSKPTPISVPVSRRSDIPGSLYVDTDMGFSGSPLPMPLDILQGNPIPPFLSKT 

FDLVDDPTLDPVISWGLTGASFVVWDPLEFARIILPRNFKHNNFSSFVRQLNTYGFRKID

TDKWEFANEAFLRGKKHLLKNIHRRRSPQSNQTCCSSTSQSQGSPTEVGGEIEKLRKERR

ALMEEMVELQQQSRGTARHVDTVNQRLKAAEQRQKQLLSFLAKLFQNRGFLERLKNFKGK

EKGGALGLEKARKKFIKHHQQPQDSPTGGEVVKYEADDWERLLMYDEETENTKGLGGMTS

SDPKGKNLMYPSEEEMSKPDYLMSFPSPEGLIKQEETTWSMGFDTTIPSFSNTDAWGNTM
DYNDVSEFGFAAETTSDGLPDVCWEQFAAGITETGFNWPTGDDDDNTPMNDP*
>G2094 (1..450) 

ATGCTAGATCCCACCGAGAAAGTAATCGATTC^GAATCAATGGAAAGCAAACTCACATCA 
GTAGATGCGATCGAAGAACACAGCAGCAGTAGCAGTAATGAAGCTATCAGCAACGAGAAG 
AAGAGTTGTGCCATTTGTGGTACCAGCAAAACCCCTCTTTGGCGAGGCGGTCCTGCCGGT 
CCCAAGTCGCTTTGTAACGCATGCGGGATCAGAAACAGAAAGAAAAGAAGAACACTGATC 
TCAAATAGATCAGAAGATAAGAAGAAGAAGAGTCATAACAGAAACCCGAAGTTTGGTGAC 
TCGTTGAAGCAGCGATTAATGGAATTGGGGAGAGAAGTGATGATGCAGCGATCAACGGCT 
GAGAATCAACGGCGGAATAAGCTTGGCGAAGAAGAGCAAGCCGCCGTGTTACTCATGGCT 
CTCTCTTATGCTTCTTCCGTTTATGCTTAA 
>G2094 Amino Acid Sequence (domain in AA coordinates: 43-68)

MLDPTEKVIDSESMESKLTSVDAIEEHSSSSSNEAISNEKKSCAICGTSKTPLWRGGPAG

PKSLCNACGIRNRKKRRTLISNRSEDKKKKSHNRNPKFGDSLKQRLMELGREVMMQRSTA

ENQRRNKLGEEEQAAVLLMALSYASSVYA*

>G2113 (90..590)

ATAACAAACTCATCAAACTTCCTCAGCGTTTCTTTTTCTTACATAAACAATTTTTCTTAC 

ATAAACAAATCTTGTTGTTTGTTGTTGTCATGGCACCGACAGTTAAAACG 

AAACCAACGAAGGTAACGGAGTCCGTTACAGAGGAGTGAGGAAGAGACCATGGGGACGTT 

ACGCAGCCGAGATCAGAGATCCTTTCAAGAAGTCACGTGTCTGGCTCGGTACTTTCGACA 

CTCCTGAAGAAGCCGCTCGTGCCTACGACAAACGTGCTATTGAGTTTCGTGGAGCTAAAG 

CCAAAACCAACTTCCCTTGTTACAACATCAACGCCC^^ 

TGAGCCAGAGCAGCACCGTGGAATCATCGTTTCCTAATCTCAACCTCGGATCTGACTCTG 
TTAGTTCGAGATTCCCTTTTCCTAAGATTCAGGTTAAGGCTGGG

AAAGGAGTGAATCGGATTCTTCGTCGGTGGTGATGGATGTCGTTAGATATGAAGGACGAC 

GTGTGGTTTTGGACTTGGATCTTAATOT^^ 

ATTATGATTATTAGATATAATTAAATGTTTCTGAATTGAG 

>G2113 Amino Acid Sequence (domain in AA coordinates: TBD) 
MAPTVKTAAVKTNEGNGVRYRGVRKRPWGRYAAEIRDPFKKSRVWLGTFDTPEEAARAYD

KRAIEFRGAKAKTNFPCYNINAHCLSLTQSLSQSSTVESSFPNLNLGSDSVSSRFPFPKI

QVKAGMMVFDERSESDSSSVVMDVVRYEGRRVVLDLDLNFPPPPEN*

>G2115 (41..733)

AATCACTCTACAAAGCCTGTACGTACACAACAACATTACCATGGTGAAACAAGAACGCAA 
GATCCAAACCAGCAGCACAAAAAAGGAAATGCCTTTGTCAT 

TTCTTCATCTTCTTCCTCGTCTTCGTCTTCGTGTAAGAACAAGAACAAGAAGAGTAAGAT 

TAAGAAGTACAAAGGAGTGAGGATGAGAAGTTGGGGATCATGGGTCTCTGAGATTAGGGC 

ACCAAATCTU^AAGACAAGGATTTGGTTAGGTTCTTACTCAACAGCTGMGCAGCTGCT 

AGCTTACGATGTTGCACTCTTATGTCTCAAAGGCCCTCAAGCCAATCTCAACTTCCCTAC 

TTCTTCTTCTTCTCATCATCTTCTTGATAATCTCTTAGATGAAAATACCCTTTTGTCCCC 

CAAATCCATCCAAAGAGTAGCTGCTCAAGCTGCCAACTCATTTAACCATTTTGCCCCTAC 

TTCATCAGCCGTCTCGTCACCGTCCGATCATGATCATCACCATGATGATGGGATGCAATC 

TTTGATGGGATCTTTTGTGGACAATCATGTGTCTTTGATGGATTCAACATCTTCATGGTA 

TGATGATCATAATGGGATGTTCTTGTTTGATAATGGAGCTCCATTCAATTACTCTCCTCA 

ACTAAACTCGACGACGATGCTCGATGAATACTTCTACGAAGATGCTGACATTCCGCTTTG 

GAGTTTCAATTAATCCGACGGTCCATAATACATACTTTAATTAGT 

>G2115 Amino Acid Sequence (conserved domain in AA coordinates: 46-115)

MVKQERKIQTSSTKKEMPLSSSPSSSSSSSSSSSSSSCKNKNKKSKIKKYKGVRMRSWGS

WVSEIRAPNQKTRIWLGSYSTAEAAARAYDVALLCLKGPQANLNFPTSSSSHHLLDNLLD

ENTLLSPKSIQRVAAQAANSFNHFAPTSSAVSSPSDHDHHHDDGMQSLMGSFVDNHVSLM

DSTSSWYDDHNGMFLFDNGAPFNYSPQLNSTTMLDEYFYEDADIPLWSFN*

>G2130 (41..988)

CCTCTCTTCATTTTTTAACTCCCTCTCTCTCTCTCTCTCTATGGAGAGACGAACGAGACG 
AGTGAAGTTCACAGAGAATCGTACGGTCACAAACGTAGCAGCTACACCATCTAACGGGTC 
TCCGAGACTGGTCCGTATCACTGTTACTGATCCTTTCGCTACTGACTCGTCTAGCGACGA 
CGACGACAACAACAACGTCACGGTGGTTCCAAGAGTGAAACGATACGTGAAGGAGATTAG 
ATTCTGCCAAGGTGAATCTTCTTCCTCCACCGCGGCGAGGAAAGGTAAGCACAAGGAGGA 
GGAAAGCGTAGTGGTTGAAGATGACGTGTCGACGTCGGTGAAGCCTAAAAAGTACAGAGG 
CGTGAGACAGAGACCTTGGGGAAAATTCGCGGCGGAGATTAGAGATCCGTCGAGCCGTAC 
TCGGATTTGGCTTGGGACTTTTGTCACGGCGGAGGAAGCTGCTATAGCGTACGATAGAGC 
CGCGATTCATCTCAAAGGACCTAAAGCGCTCACGAATTTCCTAACTCCGCCGACGCCAAC 
GCCGGTTATCGATCTCCAAACGGTTTCCGCCTGCGATTACGGTAGAGATTCTCGGCAGAG 
CCTTCATTCACCGACCTCTGTTCTAAGATTCAACGTCAACGAGGAAACAGAGCATGAGAT 
TGAAGCGATCGAGCTATCTCCGGAGAGAAAGTCGACGGTTATAAAAGAAGAAGAAGAATC 
GTCGGCGGGTTTGGTGTTCCCGGATCCGTATCTGTTACCGGATTTATCTCTCGCCGGCGA 
ATGTTTTTGGGATACCGAAATTGCCCCTGACCTTTTGTTTCTCGATGAAGAAACCAAAAT 
CCAATCAACGTTGTTACCAAACACAGAGGTTTCGAAACAAGGAGAAAACGAAACTGAAGA 
TTTCGAGTTTGGTTTGATTGATGATTTCGAGTCTTCTCCATGGGATGTGGATCATTTCTT 
CGACCATCATCATCACTCTTTCGATTAAAAATCTCTTCTTTTTTGGGGAAATTTTTGTG 
>G2130 Amino Acid Sequence (domain in AA coordinates 93-160)

MERRTRRVKFTENRTVTNVAATPSNGSPRLVRITVTDPFATDSSSDDDDNNNVTVVPRVK

RYVKEIRFCQGESSSSTAARKGKHKEEESVVVEDDVSTSVKPKKYRGVRQRPWGKFAAEI 

RDPSSRTRIWLGTFVTAEEAAIAYDRAAIHLKGPKALTNFLTPPTPTPVIDLQTVSACDY 

GRDSRQSLHSPTSVLRFNVNEETEHEIEAIELSPERKSTVIKEEEESSAGLVFPDPYLLP 

DLSLAGECFWDTEIAPDLLFLDEETKIQSTLLPNTEVSKQGENETEDFEFGLIDDFESSP 

WDVDHFFDHHHHSFD* 

>G2147 (162..1262)

CTGTGATTGTCAAGAGTTTGAACACACAAAGAAGAAAGAAGAACTCAACATTTCAAGCAA 
GAAGAAAGAGAGAAGAGAGAAGGTCCAATAATAGAGAGAACAAAAAAAAAGAGAGCTTAA 
TTGTCAGTTTATTCTCTGCAAACGTGCGGCCTAAGTAACACATGTCGAATTATGGAGTTA 
AAGAGCTCACATGGGAAAATGGGCAACTAACCGTTCATGGTCTAGGCGACGAAGTAGAAC 
CAACCACCTCGAATAACCCTATTTGGACTCAAAGTCTCAACGGTTGTGAGACTTTGGAGT 
CTGTGGTTCATCAAGCGGCTCTACAGCAGCCAAGCAAGTTTCAGCTGCAGAGTCCGAATG 
GTCCAAACCACAATTATGAGAGCAAGGATGGATCTTGTTCAAGAAAACGCGGTTATCCTC 
AAGAAATGGACCGATGGTTCGCTGTTCAAGAGGAGAGCC^TAGAGTTGGCCACAGCGTCA 
CTGCAAGTGCGAGTGGTACCAATATGTCTTGGGCGTCTTTTGAATCCGGTCGGAGCTTGA 
AGACAGCTAGAACCGGAGACAGAGACTATTTCCGCTCTGGATCGGAAACTCAAGATACTG 
AAGGAGATGAACAAGAGACAAGAGGAGAAGCAGGTAGATCTAATGGACGACGGGGACGAG 
CAGCAGCGATTCACAACGAGTCCGAAAGGAGACGGCGTGATAGGATAAACCAGAGGATGA 
GAACACTTCAGAAGCTGCTTCCTACTGCAAGTAAGGCGGATAAAGTCTCAATCTTGGATG 
ATGTTATCGAACACTTGAAACAGCTACAAGCACAAGTACAGTTCATGAGCCTAAGAGCCA 
ACTTGCCACAACAAATGATGATTCCGCAACTACCT 

AACACC^CAACAACAACAACAACAGC^GCAGCAGCAGCAAC^CAGCAGCAACAGTTTC 

AGATGTCGTTGCTTGCAACAATGGCAAGAATGGGAATGGGAGGTGGTGGAAATGGTTATG 

GAGGTTTAGTTCCTCCTCCTCCTCCTCCACCAATGATGGTCCCTCCTATGGGTAACAGAG 

ACTGCACCAACGGTTCTTC^GCCACATTATCTGATCCATACAGCGCCTTTTTCGCACAGA 

C^TGAATATGGATCTCTAC^TAAAATGGC^GCAGCTATCTATAGACAACAGTCTGATC 

AAACAACAAAGGTAAATATCGGCATGCCTTCAAGTTCTTCGAATCATGAGAAAAGAGATT 

AGTCTAGCGACCTAGTATTATTGATCCATATATATAGTTCTTGAAAGATTGTTGTATCAT 

GATTGTAAAAACTGTTTTGAGTATGGAAAAAGACTTGCAGATAAAA

>G2147 Amino Acid Sequence (domain in AA coordinates: 160-234)

MSNYGVKELTWENGQLTVHGLGDEVEPTTSNNPIWTQSLNGCETLESVVHQAALQQPSKF

QLQSPNGPNHNYESKDGSCSRKRGYPQEMDRWFAVQEESHRVGHSVTASASGTNMSWASF 

ESGRSLKTARTGDRDYFRSGSETQDTEGDEQETRGEAGRSNGRRGRAAAIHNESERRRRD 

RINQRMRTLQKLLPTASKADKVSILDDVIEHLKQLQAQVQFMSLRANLPQQMMIPQLPPP 

QSVLSIQHQQQQQQQQQQQQQQQQQFQMSLLATMARMGMGGGGNGYGGLVPPPPPPPMMV

PPMGNRDCTNGSSATLSDPYSAFFAQTMNMDLYNKMAAAIYRQQSDQTTKVNIGMPSSSS

NHEKRD*

>G2156 (384..1292)

TTTTTTTTCCCTTTCCTCGTTCAAAAAAAGTACTTGCAGAGTCACTCACTCTCAGTCTCA 
Gt^CATGAATTAATTTGAAGCTTCCCTAGAATTCTTTCACATCAATTAATACGACACCGT 
CTCGGGTGAAGAATCTCTCCTCTCTTGCCCTAAAGCGAGTTAGGGTTTAACACACAAAGC 
ATACCCTTTAGATTTGTGTCTCTTAGCTCTGTTTTTGTCGGCTTGTGTAACCGATCAACT 
CAAGCTATTGGCTCCTCACCTCCTGAAATTTGACTTCTCCAATGGATCTCAAAGTTTCTC 
TTATATGAATTCTATCTTCACCCTCACAATATCTTTATATATATGAGCCACAAGAACAAG 
AAGAGTCAGTAGATGCGGCTGCCATGGACGGTGGTTACGATCAATCCGGAGGAGCTTCTA 
GATACTTTCACAACCTCTTCAGGCCTGAGCTTCATCACCAGCTTCAACCTCAGCCTCAAC 
TTCACCCTTTGCCTGAGCCTCAGCCTCAACCTGAGCCTCAGCAGCAGAATTCAGATGATG 
AATCTGACTCCAACAAGGATCCGGGTTCCGACCCAGTTACCTCTGGTTCAACCGGGAAAC 
GTCCACGTGGACGTCCTCCGGGATCCAAGAACAAGCCGAAGCCACCGGTGATAGTGACTA 
GAGATAGCCCC^CGTGCTTAGATOTCATGTTCTTGAAGTCTCATCTGGAGCCGACATAG 
TCGAGAGCGTTACCACTTACGCTCGCAGGAGAGGAAGAGGAGTCTCCATTCTCAGTGGTA 
ACGGCACGGTGGCTAACGTCAGTCTCCGGCAGCCGGCAACGACAGCGGCTCATGGGGCAA 
ATGGTGGAACCGGAGGTGTTGTGGCTCTACATGGT^GGTTTGAGATACTTTCCCTCACAG 
GTACGGTGTTGCCGCCCCCTGCGCCGCCAGGATCCGGTGGTCTTTCTATCTTTCTTTCCG 
GCGTTCAAGGTCAGGTGATTGGAGGAAACGTGGTGGCTCCGCTTGTGGCTTCGGGTCCAG 
TGATACTAATGGCTGCATCGTTCTCTAATGCAACTTTCGAAAGGCTTCCCCTTGAAGATG
AAGGAGGAGAAGGTGGAGAGGGAGGAGAAGTTGGAGAGGGAGGAGGAGGAGAAGGTGGTC 
CACCGCCGGCCACGTCATCATCACCACCATCTGGAGCCGGTCAAGGACAGTTAAGAGGTA 
ACATGAGTGGTTATGATCAGTTTGCCGGTGATCCTCATTTGCTTGGTTGGGGAGCCGCAG
CCGCAGCCGCACCACCAAGACC^GCCTTTTAGAATTGAAAATTATGTCCGTAACATAGCT 
GTAACCAAATTTCATTTCTCAAAATTAAAAGAAAAAAAAAA 

>G2156 Amino Acid Sequence (domain in AA coordinates: 66-86)

MDGGYDQSGGASRYFHNLFRPELHHQLQPQPQLHPLPQPQPQPQPQQQNSDDESDSNKDP

GSDPVTSGSTGKRPRGRPPGSKNKPKPPVIVTRDSPNVLRSHVLEVSSGADIVESVTTYA

RRRGRGVSILSGNGTVANVSLRQPATTAAHGANGGTGGVVALHGRFEILSLTGTVLPPPA

PPGSGGLSIFLSGVQGQVIGGNVVAPLVASGPVILMAASFSNATFERLPLEDEGGEGGEG

GEVGEGGGGEGGPPPATSSSPPSGAGQGQLRGNMSGYDQFAGDPHLLGWGAAAAAAPPRP

AF*

>G2294 (24..659)

TCCTCCCTTAATTAGTATCAAAAATGGTGAAAACACTTCAAAAGACACCAAAGAGAATGT 

CATCTCCATCATCATCATCTTCATCATCCTCATCAACATCATCATCATCCATAAGG 

AGAAGTACAAGGGAGTGAGAATGAGAAGTTGGGGTTCATGGGTTTCAGAGATCAGAGCTC 

CTAATCAAAAGACAAGGATCTGGCTTGGTTCTTACTC^CTGCTGAAGCCGCGGCTAGAG 

CCTACGACGCAGCACTCCTATGTCTTAAAGGATCCTCAGCTAATAATCTCAACTTCCCAG 

AGATCTCAACTTCTCTCTACCATATTATCAACAATGGTGATAACAACAATGACATGTCCC 

CTAAGTCTATACAAAGAGTAGCAGCTGCAGCTGCTGCTGCCAACACAGATCCTTCCTCAT

CATCAGTCTCTACTTCATCTCCATTGCTTTCCTCTCCATCTGAAGATCTCTATGATGTTG 

TCTCCATGTCACAGTATGACCAACAAGTCTCCTTGTCTGAATCATCATCATGGTACAACT 

GCTTTGATGGTGATGATCAGTTCATGTTCATTAATGGAGTCTCCGCGCCGTATTTGACAA 

CATCACTTTCTGATGATTTCTTTGAGGAAGGAGATATCAGATTATGGAACTTCTGCTGAT 

TCTACTTTCATTATACCTTATTCTTTG 

>G2294 Amino Acid Sequence (conserved domain in AA coordinates: 32-102)
MVKTLQKTPKRMSSPSSSSSSSSSTSSSSI^^ 

LGSYSTAEAAARAYDAALLCLKGSSAHNLNFPEISTSLYHIINNGDNNNDMSPKSIQRVA
AAAAAANTDPSSSSVSTSSPLLSSPSEDLYDVVSMSQYDQQVSLSESSSWYNCFDGDDQF
MFINGVSAPYLTTSLSDDFFEEGDIRLWNFC*
>G2510 (16..594)

ATAACAAACTCTTTAATGTCACCACAGAGAATGAAGCTATCATCACCACCAGTTACCAAC 

AACGAACCAACCGCCACCGCTTCTGCCGTTAAATCTTGCGGCGGAGGAGGTAAAGAAACC 

AGCTCATCGACCACGAGGCATCCAGTGTACCACGGAGTTCGCAAACGCCGATGGGGAAAA 

TGGGTTTCTGAGATCAGAGAGCCCCGGAAAAAGTCTCGGATTTGGCTCGGATCTTTTCCG 

GTGCCGGAGATGGCTGCTAAGGCCTACGACGTGGCAGCGTTTTGTCTAAAAGGTAGAAAA 

GCTCAGCTGAATTTCCCTGAAGAAATCGAGGATCTACCTCGACCGTCCACGTGTACTCCC 

AGAGATATCCAAGTCGCAGCGGCCAAAGCAGCCAACGCCGTGAAGATCATCAAAATGGGA 

GATGATGACGTGGCAGGAATAGACGACGGAGATGATTTCTGGGAAGGCATTGAGCTGCCT 

GAGCTTATGATGAGTGGAGGTGGGTGGTCGCCGGAGCCTTTTGTTGCCGGAGATGATGCC 

ACGTGGCTTGTCGACGGAGACTTGTATCAGTATCAGTTCATGGCGTGTCTGTGAGTGTTG 

CTGTCGATTGTGTCGTATTCGTTATACGTGTACGTTGTATCGTTATTGTGTTGGCTCACT 

TAATTTAATGCATATGCATGTATATTTTCATTTATTTGTTTCTAGTTTATTGTTTTACGC 

GATTAATAATTAGATACCTGTTTCTCAAGTTAGTTATCAGGTTTGTACGCATCTACAA 

ATACGTATAAGTGTATGTTCTTATATACAGTTTTTGTTTGCATAAGTATTGCTACTTATT 

CTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2510 Amino Acid Sequence (conserved domain in AA coordinates: 41-108)
MSPQRMKLSSPPVTNNEPTATASAVKSCGGGGKETSSSTTRHPVYHGVRKRRWGKWVSEI
REPRKKSRIWLGSFPVPEMAAKAYDVAAFCLKGRKAQLNFPEEIEDLPRPSTCTPRDIQV
AAAKAANAVKIIKMGDDDVAGIDDGDDFWEGIELPELMMSGGGWSPEPFVAGDDATWLVD

GDLYQYQFMACL*
>G2893 (130..981)

AAATCATAAAAGCCTCTCTCTTAGTCTATTTTTATCTCACGGCTCTCTCTCCCCTCTCTA 
CACACACAAACACAAATAAAGCGTAAAACTGAAATATTTTAATTACAATTAGAAAGAGAA 
CATATTAATATGTCAAATATAACAAAGAAGAAGTGTAATGGAAATGAAGAGGGTGCAGAG 
CAGAGGAAAGGGCCTTGGACACTCGAGGAAGACACTCTTCTCACCAATTACATTTCCCAT 
AACGGTGAAGGCCGATGGAATCTGCTCGCTAAATCTTCTGGGCTAAAGAGAGCAGGAAAA
AGTTGTAGATTGAGATGGTTGAATTACCTTAAACCCGACATAAAGCGTGGGAATCTCACT 
CCTCAAGAACAACTTTTAATCCTTGAGCTCCATTCTAAATGGGGTAATAGGTGGTCAAAA 
ATTTCGAAGTATTTACCAGGAAGAACAGACAACGATATCAAAAACTACTGGAGAACTAGA 
GTCCAGAAACAAGCACGCCAGCTCAACATAGATTCCAATAGCCACAAGTTCATAGAAGTT
GTTCGTAGCTTTTGGTTTCCAAGACTGATCAACGAGATTAAAGACAACTCATACACCAAC
AATATTAAAGCTAATGCTCCTGATTTACTTGGACCAATTTTACGAGACAGCAAAGATTTG
GGTTTCAACAACATGGATTGTTCCACTTCCATGTC^^ 

TTCATGGATTTTTCTGATCTTGAAACCACAATGTCCTTGGAAGGATCACGAGGGGGTAGT 
AGTCAATGTGTGAGTGAGGTTTATAGCTCCTTCCCTTGCCTAGAGGAGGAGTACATGGTG 
GCCGTTATGGGCAGTTCAGACATTTGAGC^TTC 

TACGAGGATGATGTGACACAAGATCTAATGTGGAACATGGATGACATTTGGCAGTTTAAC 

GAGTATGCACACTTTAATTAGGTTATATTATATTTATGTACTTCTTACAAC 

TTTATCGGTCTTTTATTAAATTTTGATTGTTTTGGATTCCTTAAAA 

TAGTTTTTAATGAAAAAAATGTTTAAGCGCAAAAAAAAAAAAAAAAAAAAAAAAAAAT^^ 

>G2893 Amino Acid Sequence (conserved domain in AA coordinates: 19-120)

MSNITKKKCNGNEEGAEQRKGPWTLEEDTLLTNYISHNGEGRWNLLAKSSGLKRAGKSCR

LRWLNYLKPDIKRG^TPQEQLLILELHSKWGNRWSKISKYLPGR 

QARQLNIDSNSHKFIEVWSFWFPRLINEIKDNSYT^IKANAPDLLGPILRDSKDLGFN 
NT^CSTSMSEDLKKTSQFMDFSDLETTMSLEGSRGGSSQCVSEVTSSFPCLEEEYMVAVM 
GSSDISALHDCHVADSKYEDDVTQDLMWNMDDIWQFNEYAHFN* 
>G340 (97.. 834) 

ATGAAATCTCTGTAGTTTTTTTTTGTTCCTTTCrTAAATTTCGAAAGAAAGACA 

AAACCAAAATAACTCTTTAGATCATTGCAAGGAAAAATGTTGAAAAGTGCAAGTCCAATG

GCATTCTACGATATCGGAGAGCAGCAATACTCTACTTTCGGGTACATTTTAAGCAAACCT 

GGGAACGCAGGAGCTTACGAGATTGACCCTTCGATCCCAAACATCGACGATGCGATCTAC 

GGCTCAGATGAGTTCCGTATGTACGCTTACAAAATCAAACGGTGTCCTCGTACTCGTAGC 

CACGACTGGACGGAGTGTCCCTACGCTCACCGTGGCGAGAAAGCCACACGCCGTGATCCT 

CGCCGTTACACTTACTGTGCAGTCGCATGCCCGGCTTTCCGAAATGGCGCATGCCACCGT 

GGCGACTCATGCGAATTCGCACATGGCGTATTCGAGTACTGGCTCCACCCGGCGCGTTAC 

CGAACACGCGCATGTAACGCCGGGAACTTGTGTCAGAGGAAAGTGTGTTTCTTTGCCCAC 

GCGCCGGAGCAGCTAAGGCAGTCTGAAGGAAAGCACAGGTGCAGGTACGCATATAGGCCG 

GTGAGGGCTAGAGGTGGTGGAAACGGCGATGGAGTGACGATGAGAATGGACGACGAGGGT 

TACGACACGTGACGGTCTCCGGTGAGAAGCGGGAAAGATGATTTAGATAGTAACGAGGAG 

AAGGTGTTGTTGAAGTGTTGGAGTCGGATGAGCATTGTGGATGATCATTATGAGCCGTCC 

GATTTGGATTTGGATTTGTCACACTTTGATTGGATCTCAGAG 

GAAATCAAAGCAGAGAACAAAAGAAACCCGATAAATAAAGTGGATTTTGTTAAAATCCAC 
AAGATCAAGATTCAAGATGAGAGATCTTGTCATGTATATGGTAAATTTAATTC 
TTATTGCAATGTCGCAAAAGAAGTTACTTCTCTTTGCA^ 
TATAAGTCTTTGTATTAA 

>G340 Amino Acid Sequence (domain in AA coordinates: 37-154) 

MLKSASPMAFYDIGEQQYSTFGYILSKPGNAGAYEIDPSIPNIDDAIYGSDEFRMYAYKI 

KRCPRTRSHDWTECPYAHRGEKATRRDPRRYTYCAVACPAFRNGACHRGDSCEFAHGVFE 

YWLHPARYRTRACNAGNLCQRKVCFFAHAPEQLRQSEGKHRCRYAYRPVRARGGGNGDGV

TMRMDDEGYDTSRSPVTISGKDDLDSNEEKVLLKCWSRMSIVDDHYEPSDLDLDLSHF 

SELVD* 

>G39 (75.. 638) 

GTTTCCACAGTCCC3CGTACTTGTGCATAAAACTGTAAAACACTACTCTGAAAATTTTGCT 
TCTGTTAGGATATAATGCC^CCCTCTCCrCCTAAATCTCCTTTTATTAGCTCTTCACTCA 
AAGGAGCTCATGAAGATCGCAAATTTAAATGCTATAGGGGTGTCCGAAAGAGGTCTTGGG 
GCAAATGGGTGTCTGAAATCAGAGTTCCAAAGACTGGACGACGAATATGGCTAGGTTCAT 
ACGATGCTCCAGAGAAGGC^GCTAGAGCCTATGATGCTGCTTTGTTCTGTATTAGGGGTG 
AGAAGGGAGTTTACAATTTTCCCACTGATAAAAAGCCGCAGCTTCCAGAAGGTTCTGTCC 
GGCCTCTGTCCAAGCTCGACATACAGACAATAGCAACAAACTATGCTTCATCAGTTGTGC 
ATGTACCTTCCCATGCCACCACACTCCCGGCAACAACCCAGGTTCCCTCTGAAGTTCCTG 
CTTCCTCTGATGTTTCTGCTTCTACTGAGATTACAGAGATGGTCGATGAATATTATCTCC 
CAACCGATGCAACTGCAGAATCAATATTCTCAGTTGAAGACTTACAACTGGACAGTTTCC 
TCATGATGGACATTGATTGGATAAACAATCTAATCTGATGTGTAACGTCACTTGCAGTGA
CATTTAATATGGTTTANCTATCAGTTACCTGTCTGCTTCTTGTAAGGGTATACTTGGATC 
CTTGTCTTTGAACTTGTTTTATTTAGCATGCAAA 

>G39 Amino Acid Sequence (domain in AA coordinates: 24-90) 
MPPSPPKSPFISSSLKGAHEDRKFKCYRGVRKRSWGKWVSEIRVPKTGRRIWLGSYDAPE 
KAARAYDAALFCIRGEKGVYNFPTDKKPQLPEGSVRPLSKLDIQTIATNYASSVVHVPSH
ATTLPATTQVPSEVPASSDVSASTEITEMVDEYYLPTDATAESIFSVEDLQLDSFLMMDI

DWINNLI* 

>G439 (128.. 967) 

TATAAATCTTCGTTTCTACTTTTTTTTCTTC 
AGGGCTTCTTCTCTTTGTTTCTCCAATCTTTATTAGT^ 

TATACAAATGGCAATGGCTTTAAACATGAATGCTTACGTAGACGAGTTC^TGGAAGCTCT 
TGAAC(^TTCATGAAGGTAACTTCATCTTCTTCTACTTCGAATTCATCAAATCOVAAACC 
ATTAACTCCTAATTTCATCCCTAATAATGACCAAGTCTTACCGGTATCTAACCAAACCGG 
TCCGATTGGGCTAAACCAGCTCACTCCAACACAAATCCTCCAAATTCAGACAGAGTTACA 
TCTCCGGCAAAACCAATCTCGTCGTCGCGCTGGTAGTCATCTTCTCACCGCTAAACCAAC 
CTCAATGAAGAAAATCGACGTAGCAACTAAACCGGTTAAACTATACCGAGGCGTAAGACA 
GAGGCAATGGGGTAAATGGGTAGCTGAGATTCGGCTACCTAAAAACCGAACCCGGTTATG 
GCTCGGTACGOTCGAAACGGCTCAAGAAGCTGCATTAGCTTACGATCAAGCAGCTCATA^ 
GATCAGAGGAGACAACGCTCGTCTCAATTTCCCAGACATTGTTCGTCAAGGACACTATAA 
ACAGATATTGTCTCCGTCTATCAACGCAAAGATCGAATCCATCTGCAATAGTTCTGATCT 
TCCACTGCCTCAGATCGAGAAAC^GAACAAAACAGAGGAGGTGCTCTCTGGTTTTTCCAA 
ACCGGAGAAAGAACCGGAATTTGGGGAGATATACGGATGCGGATACTCGGGCTCATCTCC 
TGAGTCGGATATAACGTTGTTGGATTTCTCAAGCGACTGTGTGAAAGAAGATGAGAGTTT 
CTTGATGGGTTTGCACAAGTATCCTTCTTTGGAGATTGATTGGGACGCTATAGAGAAACT 
CTTCTGAATCCATTTTATCTTTTTGATTCATTTGTCTCTAAATTGTAGAATTTTATTTTC 
AGAGCTTTGTAAGGGAAGTTCTTGAATGAGAGTTGCAGAGGACTAGTGGAACCTAACTCT 
GTTTTCTTTTGTAAGTATTGTTTATAATGGGCCGTTGAATGGGCCTTATTGATTTAAACA 
GCCCAAGTTTTTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G439 Amino Acid Sequence (domain in AA coordinates: 110-177) 
MAMALNMNAYVDEFMEALEPFMKV 

GLNQLTPTQILQIQTELHLRQNQSRRRAGSHLLTAKPTSMKKIDVATKPVKLYRGVRQRQ
WGKWVAEIRLPKNRTRLWLGTFETAQEAALAYDQAAHKIRGDNARLNFPDIVRQGHYKQI

LSPSINAKIESICNSSDLPLPQIEKQNKTEEVLSGFSKPEKEPEFGEIYGCGYSGSSPES
DITLLDFSSDCVKEDESFLMGLHKYPSLEIDWDAIEKLF*

>G470 (1..2580) 

ATGGCGAGTTCGGAGGTTTCAATGAAAGGTAATCGTGGAGGAGATAACTTCTCCTCCTCT 
GGTTTTAGTGACCCTAAGGAGACTAGAAATGTCTCCGTCGCCGGCGAGGGGCAAAAAAGT 
AATTCTACCCGATCCGCTGCGGCTGAGCGTGCTTTGGACCCTGAGGCTGCTCTTTACAGA 
GAGCTATGGCACGCTTGTGCTGGTCCGCTTGTGACGGTTCCTAGACAAGACGACCGAGTC 
TTCTATTTTCCTCAAGGACACATCGAGCAGGTGGAGGCTTCGACGAACCAGGCGGCAGAA 
CAACAGATGCCTCTCTATGATCTTCCGTCAAAGCTTCTCTGTCGAGTTATTAATGTAGAT 
TTAAAGGCAGAGGCAGATACAGATGAAGTTTATGCGCAGATTACTCTTCTTCCTGAGGCT 
AATCAAGACGAGAATGCAATTGAGAAAGAAGCGCCTCTTCCTCCACCTCCGAGGTTCCAG 
GTGCATTCGTTCTGCAAAACCTTGACTGCATCCGACACAAGTACACATGGTGGATTTTCT 
GTTCTTAGGCGACATGCGGATGAATGTCTCCCACCTCTGGATATGTCTCGACAGCCTCCC 
ACTCAAGAGTTAGTTGCAAAGGATTTGCATGCAAATGAGTGGCGATTCAGACATATATTC 
CGGGGTCAACCACG8AGGCATTTGCTACAGAGTGGGTGGAGTGTGTTTGTTAGCTCCAAA 
AGGCTAGTTGCAGGCGATGCGTTTATATTTCTAAGGGGCGAGAATGGAGAATTAAGAGTT 
GGTGTAAGGCGTGCGATGCGACAACAAGGAAACGTGCCGTCTTCTGTTATATCTAGCCAT 
AGCATGCATCTTGGAGTACTGGCCACCGCATGGCATGCCATTTCAACAGGGACTATGTTT 
ACAGTCTACTACAAACCCAGGACGAGCCCATCTGAGTTTATTGTTCCGTTCGATCAGTAT 
ATGGAGTCTGTTAAGAATAACTACTCTATTGGCATGAGATTCAAAATGAGATTTGAAGGC 
GAAGAGGCTCCTGAGCAGAGGTTTACTGGCACAATCGTTGGGATTGAAGAGTCTGATCCT 
ACTAGGTGGCCAAAATCAAAGTGGAGATCCCTCAAGGTGAGATGGGATGAGACTTCTAGT 
ATTCCTCGACCTGATAGAGTATCTCCGTGGAAAGTAGAGCCAGCTCTTGCTCCTCCTGCT 
TTGAGTCCTGTTCCAATGCCTAGGCCTAAGAGGCCCAGATCAAATATAGCACCTTCATCT 
CCTGACTCTTCGATGCTTACCAGAGAAGGTACAACTAAGGCAAACATGGACCCTTTACCA

GCAAGCGGACTTTCAAGGGTCTTGCAAGGTCAAGAATACTCGACCTTGAGGACGAAACAT 

ACTGAGAGTGTAGAGTGTGATGCTCCTGAGAATTCTGTTGTCTGGCAATCTTCAGCGGAT 

GATGATAAGGTTGACGTGGTTTCGGGTTCTAGAAGATATGGATCTGAGAACTGGATGTCC 

TCAGCCAGGCATGAACCTACTTACAGAGATTTGCTCTCCGGCTTTGGGACTAA(^ 

CCATCCCATGGTCAGCGGATACCTTTTTATGACCATTCATCATCACCTTCTATG 

AAGAGAATCTTGAGTGATTCAGAAGGCAAGTTCGATTATCTTGCTAACCAGTGGCAGATG 

ATACACTCTGGTCTCTCCCTGAAGTTACATGAATCTCCTAAGGTACCTGCAGCAACTGAT 

GCGTCTCTCCAAGGGCGATGCAATGTTAAATACAGCGAATATCCTGTTCTTAATGGTCTA 

TCGACTGAGAATGCTGGTGGTAACTGGCCAATACGTCCACGTGCTTTGAATTATTATGAG 

GAAGTGGTCAATGCTCAAGCGCAAGCTCAGGCTAGGGAGCAAGTAACAAAACAACCCTTC 

ACGATACAAGAGGAGACAGCAAAGTCAAGAGAAGGGAACTGCAGGCTCTTTGGCATTCCT 

CTGACCAACAACATGAATGGGACAGACTCAACCATGTCTCAGAGAAACAACTTGAATGAT 

GCTGCGGGGCTTACACAGATAGCATCACCAAAGGTTCAGGACCTTTCAGATCAGTCAAAA 

GGGTCAAAATCAACAAACGATCATCGTGAACAGGGAAGACCATTCCAGACTAATAATCCT 

CATCCGAAGGATGCTCAAACGAAAACCAACTCAAGTAGGAGTTGCACAAAGGTTCACAAG 

CAGGGAATTGCACTTGGCCGTTCAGTGGATCTTTCAAAGTTCCAAAACTATGAGGAGTTA 

GTCGCTGAGCTGGACAGGCTGTTTGAGTTCAATGGAGAGTTGATGGCTCCTAAGAAAGAT 

TGGTTGATAGTTTAC^CAGATGAAGAGAATGATATGATGCTTGTTGGTGACGATCCTTGG 

(^GGAGTTTTGTTGCATGGTTCGCAAAATCrTCATATACACGAAAGAGGAAGTGAGGAAG 

ATGAACCCGGGGACTTTAAGCTGTAGGAGCGAGGAAGAAGCAGTTGTTGGGGAAGGATCA 

GATGCAAAGGACGCCAAGTCTGCATCAAATCCTTCATTGTCCAGCGCTGGGAACT 

>G470 Amino Acid Sequence (domain in AA coordinates: 61-393) 

MASSEVSMKGNRGGDNFSSSGFSDPKETRNVSVAGEGQKSNSTRSAAAERALDPEAALYR 

ELWHACAGPLVTVPRQDDRVFYFPQGHIEQVEASTNQAAEQQMPLYDLPSKLLCRVINVD

LKAEADTDEVYAQITLLPEANQDENAIEKEAPLPPPPRFQVHSFCKTLTASDTSTHGGFS 

VLRRHADECLPPLDMSRQPPTQELVAKDLHANEWRFRHIFRGQPRRHLLQSGWSVFVSSK

RLVAGDAFIFLRGENGELRVGVRRAMRQQGNVPSSVISSHSMHLGVLATAWHAISTGTMF

TVYYKPRTSPSEFIVPFDQYMESVKNNYSIGMRFKMRFEGEEAPEQRFTGTIVGIEESDP

TRWPKSKWRSLKVRWDETSSIPRPDRVSPWKVEPALAPPALSPVPMPRPKRPRSNIAPSS

PDSSMLTREGTTKANMDPLPASGLSRVLQGQEYSTLRTKHTESVECDAPENSVVWQSSAD

DDKVDVVSGSRRYGSENWMSSARHEPTYTDLLSGFGTNIDPSHGQRIPFYDHSSSPSMPA
KRILSDSEGKFDYLANQWQMIHSGLSLKLHESPKVPAATDASLQGRCNVKYSEYPVLNGL
STENAGGNWPIRPRALNYYEEVVNAQAQAQAREQVTKQPFTIQEETAKSREGNCRLFGIP
LTNNMNGTDSTMSQRNNLNDAAGLTQIASPKVQDLSDQSKGSKSTNDHREQGRPFQTNNP

HPKDAQTKTNSSRSCTKVHKQGIALGRSVDLSKFQNYEELVAELDRLFEFNGELMAPKKD
WLIVYTDEENDMMLVGDDPWQEFCCMVRKIFIYTKEEVRKMNPGTLSCRSEEEAVVGEGS
DAKDAKSASNPSLSSAGNS*
>G652 (1..606) 

atgagcggaggaggagacgtgaacatgagtggtggagacagacgcaagggaacggtgaag 
tggtttgatacacagaaggggtttggtttcatcacacctagcgacggtggtgacgatctc 
ttcgttcaccagtcttccatcagatctgaaggatttcgtagcctcgcagctgaggaatct 
gttgagttcgacgttgaggttgacaactccggccgtcccaaggctattgaagtgtctgga 
cccgacggtgctcccgttcagggtaacagcggtggtggtggttcatctggtggacgcggt 
ggttttggcggcggtggtggaagaggagggggacgtggtggaggaagctacggaggaggt 
tatggtggaagaggaagcggtggccgtggaggaggtggtggtgataattcttgctttaag 
tgcggtgaaccaggtcacatggcgagagaatgctctcaaggtggtggaggatacagcgga 
ggcgggggtggtggaaggtacgggtctggcggcggcggaggaggaggtggtggtggctta 
agctgctacagctgtggagagtctgggcactttgcaagggattgcactagcggtggtgct 
cgttga 

>G652 Amino Acid Sequence (domain in AA coordinates: 28-49, 137-151, 182-196)

MSGGGDV^SGGDRRKGTVKWFDTQKGFGFITPSDGGDDLFVHQSSIRSEGFRSLAAEES 

VEFDVEVDNSGRPKAIEVSGPDGAPVQGNSGGGGSSGGRGGFGGGGGRGGGRGGGSYGGG 

YGGRGSGGRGGGGGDNSCFKCGEPGHMARECSQGGGGYSGGGGGGRYGSGGGGGGGGGGL 

SCYSCGESGHFARDCTSGGAR* 

>G671 (61.. 1119) 

TTCACTTGAGAACAACCCCCTTTGAACTCGATCAAGAAAGCTAAGTTTGAAGAATCAAGA 
ATGGTGCGGACACCGTGTTGCAAAGCCGAACTAGGGTTAAAGAAAGGAGCTTGGACTCCC
GAGGAAGATCAGAAGCTTCTCTCTTACCTTAACCGCCACGGTGAAGGTGGATGGCGAACT 
CTCCCCGAAAAAGCTGGACTCAAGAGATGCGGCAAAAGCTGCAGACTGAGATGGGCCAAT 
TATCTTAGACCTGACATCAT^AAGAGGAGAGTTCACTGAAGACGAAGAACGTTCAATCATC 
TCTCTTCACGCCCTTCACGGCAACAAATGGTCTGCTATAGCTCGTGGACTACCAGGAAGA 
ACCGATAACGAGATCAAGAACTACTGGAACACTCATATCAAAAAACGTTTGATCAAGAAA 
GGTATTGATCCAGTTACACACAAGGGCATAACCTCCGGTACCGACAAATCAGAAAACCTC 
CCGGAGAAACAAAATGTTAATCTGACAACTAGTGACCATGATCTTGATAATGACAAGGCG 
AAGAAGAACAACAAGAATTTTGGATTATCATCGGCTAGTTTCTTGAAC^^GTAGCTAAT 
AGGTTCGGAAAGAGAATCAATCAGAGTGTTCTGTCTGAGATTATCGGAAGTGGAGGCCCA 
CTTGCTTCTACTAGTCACACTACTAATACTACAACTACAAGTGTTTCCGTTGACTCTGAA 
TCAGTTAAGTCAACGAGTTCTTCCTTCGCACCAACCTCGAATCTTCTCTGCCATGGGACC 
GTTGCAACAACACCAGTTTCATCGAACTTTGACGTTGATGGTAACGTTAATCTGACGTGT 
TCTTCGTCCACGTTCTCTGATTCCTCCGTTAACAATCCTCTAATGTACTGCGATAATTTC 
GTTGGTAATAACAACGTTGATGATGAGGATACTATCGGGTTCTCCACATTTCTGAATGAT 
GAAGATTTCATGATGTTGGAGGAGTCTTGTGTTGAAAACACTGCGTTCATGAAAGAACTT 
ACGAGGTTTCTTCACGAGGATGAAAACGACGTCGTTGATGTGACGCCGGTCTATGAACGT 
CAAGACTTGTTTGACGAAATTGATAACTATTTTGGATGAGTGAAACTCATAATCGATGAA 
TCCCACGTGACCATGTCAATATGATGTCTATGGATATGTTACCTTGATGATGTTGATGGT 
AATAATAATAAATAATAGATGGTGATGATGACCATGCATGAATCATGAATGTAGTTCGTG 
TTGTCACATATGCTTGTGTTTTTGTGTT^ 

TGTAAATGGATTATAAATGGTGATGTAATAATTATAATGTTAAAAAAAAAAAAAAAAAAA 
AAAA 

>G671 Amino Acid Sequence (domain in AA coordinates: 15-115) 
MVRTPCCKAELGLKKGAWTPEEDQKLLSYLNRHGEGGWRTLPEKAGLKRCGKSCRLRWAN

YLRPDIKRGEFTEDEERSIISLHALHGNKWSAIARGLPGRTDNEIKNYWNTHIKKRLIKK
GIDPVTHKGITSGTDKSENLPEKQNVNLTTSDHDLDNDKAKKNNKNFGLSSA^ 
RFGKRINQSVLSEIIGSGGPLASTSHTTNTTTTSVSVDSESVKSTSSSFAPTSNLLCHGT 
VATTPVSSNFDVDGNVNLTCSSSTFSDSSVNNPLMYCDNFVGNNNVDDEDTIGFSTFLND
EDFMMLEESCVENTAFMKELTRFLHEDENDVVDVTPVYERQDLFDEIDNYFG*
>G779 (110.. 712) 

GACATGC^TGTAAGCATTCGGTTAATTAATCGAGTCAAAGATATATATCAGTAT^ATACAT 

ATGTGTATATTTCTGGAAAAAGAATATATATATTGAGAAATAAGAAAAGATGAAAATGGA 

AAATGGTATGTATAAAAAGAAAGGAGTGTGCGACTCTTGTGTCTCGTCCAAAAGCAGATC 

CAACCACAGCCCCAAAAGAAGCATGATGGAGCCTCAGCCTCACCATCTCCTCATGGATTG 

GAACAAAGCTAATGATCTTCTCACACAAGAACACGCAGCTTTTCTCAATGATCCTCACCA 

TCTCATGTTAGATCCACCTCCCGAAACCCTAATTCACTTGGACGAAGACGAAGAGTACGA 

TGAAGACATGGATGCGATGAAGGAGATGCAGTACATGATCGCCGTCATGCAGCCCGTAGA 

CATCGACCCTGCCACGGTCCCTAAGCCGAACCGCCGTAACGTAAGGATAAGCGACGATCC 

TCAGACGGTGGTTGCTCGTCGGCGTCGGGAAAGGATCAGCGAGAAGATCCGAATTCTCAA 

GAGGATCGTGCCTGGTGGTGTOAAGATGGACACAGCTTCCATGCTCGACGAAGCCATACG 

TTACACCAAGTTCTTGAAACGGC^GGTGAGGATTCTTCAGCCTCACTCTCAGATTGGAGC 

TCCTATGGCTAACCCCTCTTACCTTTGTTATTACCACAACTCCCAACCCTGATGAACTAC 

ACAGAAGCTCGCTAGCTAGACATTTGGTGTCATCCTCTCAACCTTT 

>G779 Amino Acid Sequence (domain in AA coordinates: 126-182) 

MKMENGMYKKKGVCDSCVSSKSRSNHSPKRSMMEPQPHHLLMDWNKANDLLTQEHAAFLN

DPHHLMLDPPPETLIHLDEDEEYDEDiynDAMKEMQYMIAVMQPVDIDPATVPK^ 

SDDPQTWARRRRERISEKIRILKRIVPGGAKMD^^ 

QIGAPMANPSYLCYYHNSQP* 

>G962 (148.. 1392) 

CGTCGACTCTCTACTCAACACCACTCAATTTCATCTCTCTTTTTCCCTTCCATTGTTAGT 
ATAAAAACCAAGCAAACCCTTAATCACTTTTCATCATCATATATCACCTTAATCCACATG 
CATACACATATCTAGTCTTTTTGATATATGGCAATTGTATCCTCCACAACAAGCATCATT 
CCCATGAGTAACCAAGTCAACAATAACGAAAAAGGTATAGAAGACAATGATCATAGAGGC 
GGCCAAGAGAGTCATGTCCAAAATGAAGATGAAGCTGATGATCATGATCATGACATGGTC 
ATGCCCGGATTTAGATTCCATCCTACCGAAGAAGAACTCATAGAGTTTTACCTTCGCCGA 
AAAGTTGAAGGCAAACGCTTTAATGTAGAACTCATCACTTTCCTCGATCTTTATCGCTAT 
GATCCTTGGGAACTTCCTGCTATGGCGGCGATAGGAGAGAAAGAGTGGTACTTCTATGTG
CCAAGAGATCGGAAATATAGAAATGGAGATAGACCGAACCGAGTAACGACTTCAGGATAT 
TGGAAAGCCACCGGAGCTGATAGGATGATCAGATCGGAGACTTCTCGGCCTATCGGATTA 
AAGAAAACCCTAG1TTTCTACTCTGGTAAAGCCCCTAAAGGCACTCGTACTAGTTGGATC 
ATGAACGAGTATCGTCTTCCGCACCATGAAACCGAGAAGTACCAAAAGGCTGAAATATCA 
TTGTGCCGAGTGTACAAAAGGCCAGGAGTAGAAGATCATCCATCGGTACCACGTTCTCTC 
TCCACAAGACATCATAACCATAACTCATCGACATCATCCCGTTTAGCCTTAAGACAACAA 
CAACACCATTCATCCTCCTCTAAT(^TTCCGA^ 

AACAATCTCGAGAAGCTCTCCACCGAATATTCCGGCGACGGCAGCACAACAACAACGACC 
AC^VAACAGTAACTCTGACGTTACCATTGCTCTAGCCAATCAAAACATATATCGTCCAATG 
CCTTACGACACAAGCAACAACACATTGATAGTCT 

GAAACTGCCATTGTTGACGATCTTCAAAGACTAGTTAACTACCAAATATCAGATGGAGGT 
AACATCAATGACCAATACTTTCAAATTGCT 

GCTAACGCAAACGCATTACAATTGGTGGCTGCGGCGACTACAGCGACAACGCTAATGCCT 
CAAACTGAAGCGGCGTTAGCTATGAACATGATTCCTGCA^ 

TTGTGGGATATGTGGAATCCAATAGTACCAGATGGAAACAGAGATCACTATACTAATATT 
CCTTTTAAGTAATTTAATTAGATC^TGATTATTATCCATGAC^TAATTAATGCTGCTTT 

GCGC 

>G962 Amino Acid Sequence (domain in AA coordinates: 53-175) 
MAIVSSTTSIIPMSNQVNNNEKGIEDNDHRGGQESHVQNEDEADDHDHDMVMPGFRFHPT
EEELIEFYLRRKVEGKRFNVELITFLDLYRYDPWELPA>^ 

DRPNRVTTSGYWKATGADRMIRSETSRPIGLKKTLWYSGKAPKGTRTSWIMNEYRLPHH 
ETEKYQKAEISLCRVYKRPGVEDHPSVPRSLSTRlffll^SSTSSRIiALRQQQHHSSSSra 
SDimLmmDmiNNLEKLSTEYSGDGSTTTTTTNSNSDVT 

IVSTRNHQDDDETAIVDDLQRLVNYQISDGGNINHQYFQIAQQFHHTQQQNANANALQLV 
AAATTATTLMPQTQAALAMNMIPAGTIPNNALWDMWNPIVPDGNRDHYTNIPFK*
>G977 (46.. 591) 

CACCAAACTCACCTGAAACCCTATTTCCATTTACCATTCAC^CTAATGGCACGACCACAA 
CAACGCTTTCGAGGCGTTAGACAGAGGCATTGGGGCTCTTGGGTCTCCGAAATTCGTCAC 
CCTCTCTTGAAAACAAGAATCTGGCTAGGGACGTTTGAGACAGCGGAGGATGCAGCAAGG 
GCCTACGACGAGGCGGCTAGGCTAATGTGTGGCCCGAGAGCTCGTACTAATTTCCCATAC 
AACCCTAATGCCATTCCTACTTCCTCTTCCAAGCTTCTATCAGCAACTCTTACCGCTAAA 
CTCCACAAATGCTACATGGCTTCTCTTCAAATC 

ACGCAGACCGCAAGATCACAATCCGCGGACAGTGACGGTGTGACGGCTAACGAAAGTCAT 

TTGAACAGAGGAGTAACGGAGACGACAGAGATCAAGTGGGAAGATGGAAATGCGAATATG 

CAACAGAATTTTAGGCCATTGGAGGAAGATCATATCGAGCAAATGATTGAGGAGCTGCTT 

CACTACGGTTCCATTGAGCTTTGCTCTGTTTTACCAACTCAGACGCTGTGAGAAATGGCC 

TTGTCGTTTTAGCGTATTCTTTTCATTTTTATTTTTGTTTCCACAAAAACGGCGTCGTAA 

GTGATGAGAGTAGTAGTGAGAGAAGGCTAATTTCAAGACATTTTGATCTGAATTGGCCTC 

TTTTGAAACACTGATTCTAGTTTCTATAAGAGCAATCGATCATATGCTATGTTATGTATA 

GTATTATAAAAAAATGTTATTTTCTGATTNAAAAAAAAAAAAAAAAAAAAAAA 

>G977 Amino Acid Sequence (domain in AA coordinates: 5-72) 

MARPQQRFRGVRQRHWGSWVSEIRHPLLKTRIWLGTFETAEDAARAYDEAARLMCGPRAR

TNFPYNPNAIPTSSSKLLSATLTAKLHKCYMASLQMTKQTQTQTQTQTARSQSADSDGVT

ANESHLNRGVTETTEIKWEDGNANMQQNFRPLEEDHIEQMIEELLHYGSIELCSVLPTQT

L* 

>G1063 (241.. 966) 

GTTAAAGAAGATGGATGGGCCACAAGTTGCTATATAAATCCTTCCACTTCTTGTTGTATA 
CTATTGCTTGAGTTCTGATTGGGCACAGTAGTACCATTGCCATTTCTCTCACACATACCG 
TCTCTTTCTCTCATCATCAATCATCAATCATCCAAAAGAAAAAACCCTAAAATTTCACTT 
GTAAGCTTTTCACCAGTTTCTCTCC^TACCCATTTTATCAGCTTCTCCATATCTTTCTCT 
ATGGATTCTGACATAATGAACATGATGATGCATCAGATGGAGAAGCTTCCTGAGTTTTGT 
AACCCTAATTCCTCTTTCTTCTCTCCCGACCACAACAACACTTACCCTTTTCTCTTTAAC 
TCCACTCATTACCAGTCCGATCACTCAATGACCAACGAACCAGGTTTCCGCTACGGTTCC 
GGTTTACTCACTAACCCTTCTTCTATCTCTCCCAACACAGCTTACTCTTCCGTTTTTCTT 
GACAAAAGAAACAACAGTAACAACAACAATAATGGCACGAACATGGCAGCTATGCGAGAG 
ATGATCTTCCGTATCGCCGTGATGCAACCGATCCATATCGATCCCGAGGCGGTTAAGCCA 
CCGAAGAGGAGGAACGTCAGGATCTCTAAAGATCCTCAAAGCGTGGCGGCTAGGCATAGA

AGGGAGAGAATAAGCGAGAGGATTCGGATTTTGCAACGGCTTGTTCCTGGTGGGACGAAG

ATGGATACAGCTTCGATGCTCGATGAAGCAATTCATTATGTGAAGTTm 

GTGCAGTCTCTGGAGGAGCAGGCGGTGGTTACTGGCGGAGGGGGAGGAGGAGGAGGAAGG 

GTTTTGATCGGTGGAGGTGGAATGACGGCGGCGAGTGGTGGTGGTGGCGGCGGGGGAGTG 

GTTATGAAAGGGTGTGGAACAGTGGGGACTCATGAGATGGTGGGCAATGCACAGATTCTT 

AGATGATGATGATTTTTAATTTTATTATTATTATATTAATGTTGGAGAAAAAGAGAAAAA 

TGATTCTGGAGAGGGAAGCCAAGTAATTTATGTGAGAGTCTTTAATTTAACTTTATTTTC 

TTGTTTAGATAATGTGTAATGATGGTTTTTAAAGCCAAAGACTCTCCATGGTTGTTGGAG 

CGAGTTTG 

>G1063 Amino Acid Sequence (domain in AA coordinates: 131-182)

MDSDIMNMMMHQMEKLPEFCNPNSSFFSPDHNOTYPF 

GLLTNPSSISPNTAYSSVFLDKRNNSNNN1^GTNT4AAMREMIFRI 

PKRRtTVRISKDPQSVAARHRRERISERIRILQRLVPGGTKMDTASMLDEAIHW 

VQSLEEQAVVTGGGGGGGGRVLIGGGGMTAASGGGGGGGVVMKGCGTVGTHQMVGNAQIL

R* 

>G1140 (67.. 729) 

ATCCAAGATCCTCCAACTCACAGAAAGGCAGATTCAAGAACAGTAGTGAAGGAGAGATCT 

GGTAAAATGGCGAGAGAGAAGATAAGGATAAAGAAGATTGATAACATAACAGCGAGACAA 

GTTACTTTCTCAAAGAGAAGAAGAGGAATCTTCAAGAAAGCCGATGAACTTTCAGTTCTT 

TGCGATGCTGATGTTGCTCTCATCATCTTCTCTGCCACCGGAAAGCTCTTCGAGTTCTCC 

AGCTCAAGAATGAGAGACATATTGGGAAGGTATAGTCTTCATGCAAGTAACATCAACAAA 

TTGATGGATCCACCTTCTACTCATCTCCGGCTTGAGAATTGTAACCTCTCCAGACTAAGT 

AAGGAAGTCGAAGACAAAACCAAGCAGCTACGGAAACTGAGAGGAGAGGATCTTGATGGA 

TTGAACTTAGAAGAGTTGCAGCGGCTGGAGAAACTACTTGAATCCGGACTTAGCCGTGTG 

TCTGAAAAGAAGGGCGAGTGTGTGATGAGCCAAATTTTCTCACTTGAGAAACGGGGATCG 

GAATTGGTGGATGAGAATAAGAGACTGAGGGATAAACTAGAGACGTTGGAAAGGGCAAAA 

CTGACGACGCTTAAAGAGGCTTTGGAGACAGAGTCGGTGACCACAAATGTGTCAAGCTAC 

GACAGTGGAACTCCCCTTGAGGATGACTCCGACACTTCCCTGAAGCTTGGGCTTCCATCT 

TGGGAATGAATCTGAGAGAGAGAAAGATCCAGCAGAGTTGACTTCGATGGAAGCCCACAA 

ATATTAAGTCTACCTTTTCCCTTTCTTTTCTTTGAATAAGTGTTGAAAAAGAATTGAGAT 

GGGAAGGATGAATTCTCATTGCATTGCAGAGAAGCAAGTTTCAGATATTGTACGTGTTAT 

TGGGTCTTTATAACTATTTTTCTCCCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G1140 Amino Acid Sequence (conserved domain in AA coordinates: 2-57)

MAREKIRIKKIDNITARQVTFSKRRRGIFKKADELSVLCDADVALIIFSATGKLFEFSSS

RMRDILGRYSLHASNINKLMDPPSTHLRLENCNLSRLSKEVEDKTKQLRKLRGEDLDGLN

LEELQRLEKLLESGLSRVSEKKGECVMSQIFSLEKRGSELVDENKRLRDKLETLERAKLT 

TLKEALETESVTTNVSSYDSGTPLEDDSDTSLKLGLPSWE* 

>G1425 (43.. 1005) 

ACTCTCTCAAACCATAAAAAATATTCTCCGATCATCATTTTAATGGAGAGTACAGATTCT 
TCCGGTGGTCCTCCGCCGCCGCAACCAAACCTCCCTCCAGGATTCCGGTTTCATCCAACA 
GACGAAGAACTTGTAATTCATTACCTCAAACGCAAAGCAGATTCTGTTCCTTTACCAGTC 
GCGATCATCGCCGACGTTGATCTTTACAAATTTGATCCATGGGAACTTCCCGCGAAAGCT 
TCGTTTGGAGAACAAGAATGGTATTTTTTCAGTCCAAGAGATCGGAAATATCCCAACGGA 
GCTAGACCTAACCGAGCTGCGACTTCCGGTTATTGGAAAGCGACTGGTACAGATAAACCG 
GTGATTTCAACCGGCGGTGGTGGTAGTAAAAAAGTGGGAGTTAAAAAGGCTCTAGTGTTT 
TACAGTGGTAAACCACCAAAAGGAGTTAAATCAGATTGGATTATGCATGAATATCGGTTA 

ACTGATAATAAACCIACTCACATTTGTC 

GATGATTGGGTGTTGTGTCGTATCTACAAGAAAAACAATAGTACAGCATCTAGACATCAT 
CATCATCTTCATCATATTCATCTAGATAATGATCATCATCGTCATGATATGATGATTGAT 
GATGATCGATTCCGTCATGTTCCTCCTGGTCTTCACTTCCCGGCGATTTTTTCTGACAAT 
AATGATCCGACGGCTATATATGATGGTGGCGGCGGCGGATACGGAGGTGGAAGTTACTCG 
ATGAATCATTGTTTCGCATCTGGATCAAAGCAGGAGCAGTTGTTTCCACCGGTGATGATG 
ATGACTAGTCTAAATCAAGATTCCGGTATTGGATCGTCGTCGTCACCTAGCAAGAGATTT 
AACGGCGGCGGCGTTGGAGATTGTTCGACTTCTATGGCGGCGACGCCGTTAATGCAGAAC 
CAAGGTGGGATTTACCAATTGCCTGGTTTGAATTGGTATTCTTGAAAACAATTTACGATG 
AAGAATTTTTAAAATTTGTGTATATATATACGGTTTGAGTGATTAGGGGGCATTGGGGGA 
TTTATTTACGGTTGATTATTATTGTAGTGTTATAGAACTAAGGAGATTAAATTAAATAGA
TTGGAGGAAAAAAAAAAAAAAAAA 

>G1425 Amino Acid Sequence (domain in AA coordinates: 20-173) 
MESTDSSGGPPPPQPNLPPGFRFHPTDEELVIHYLKRKADSVPLPVAIIADVDLYKFDPW 
ELPAKASFGEQEWYFFSPRDRKYPNGARPNRAATSGYWKATGTDKPVISTGGGGSKKVGV
KKALVFYSGKPPKGVKSDWIMHEYRLTDNKPTO^ 
TASRHHHHLHHIHLDNDHHRHDMMIDD^ 

GGGSYSl#raCFASGSKQEQLFPPVMMMTSLNQDSGIGSSSSPSKRFNGGGVGDCSTSMAA 

TPLMQNQGGIYQLPGLNWYS*
>G1449 (105.. 581) 

TAGACAGAGAGAAATAGAAATAGAGAGAGAGAGACATGAAGAGCACTCTCAATAGAGAAG 
AGAAGGAAGCATGAAGCTAGCTCTGCAGCTTCAAGGTCTCATTAATGGAGGTCTCTAACT 
CTTGTTCTTCATTTTCTTCATCCTCTGTCGAC^ 
CTGTTAATCTCTCCCTTAGTCTCACATTTCC 

i^GATTGGCCACCGATAAAGTCTAGATTAAGAGATACACTAAAGGGTCGTCGTCTTCTTC 
GTCGTGGTGATGACACTTCTCTCTTTGTTAAGGTTTATATGGAAGGTGTTCCCATTGGAA 
GAAAACTCGACCTTTG03TATTCTCAGGCTACGAGAGTCTATTAGAAAATCTCTCTCACA 
TGTTCGATACTTCAATCATCTGCGGTAATCGAGATCGAAA^ 

AAGACAAGGATGGAGATTGGATGATGGTCGGAGATATTCCATGGGATATGTTTCTTGAAA 
CCGTGAGAAGACTAAAGATCACGAGACCGGAGAGGTATTAAAACTTGGATCGGTCAAGGC 
TGTGATTGCGCAGTTACGAGACGTGTAAGATTTAGGCATTGATGAAGAGACTTGAGGCGG 
GACGGAGCTATTGCTGCATATTGCAACAAAGGCCTTGAAGAAGTTGGAGAATTGATTGAT 
GCATATATTTATTTATATGACACCTTTGAGTGTGTTTTTTCTTATAAATAJ^TCACAATA 

TCCAAGACTTCTCTTTAAA 

>G1449 Amino Acid Sequence (domain in AA coordinates: 48-53,74-107,122- 
MEVSNSCSSFSSSSVDSTKPSPSESSVNLSLSLTFPSTSPQREARQDWPPIKSRLRDTLK 
GRRLLRRGDDTSLFVKVYMEGVPIGRKLDLCVFSGYESLLENLSHMFDTSIICGNRDRKH
HVLTYEDKDGDWMMVGDIPWDMFLETVRRLKITRPERY*
>G1897 (1..678) 

ATGCCTTCTGAATTCAGTGAATCTCGTCGGGTTCCTAAGATTCCCCACGGCCAAGGAGGA 
TCTGTTGCGATTCCGACGGATCAACAAGAGCAGCTTTCTTGTCCTCGCTGTGAATCAACC 
AACACCAAGTTCTGTTACTACAACAACTACAACTTCTCACAACCTCGTCATTTCTGCAAG 
TCTTGTCGCCGTTACTGGACTCATGGAGGTACTCTCCGTGACATTCCCGTCGGTGGTGTT 
TCCCGTAAAAGCTCAAAACGTTCCCGGACTTATTCCTCTGCCGCTACCACCTCCGTTGTC 
GGAAGCCGGAACTTTCCCTTACAAGCTACGCCTGTTCTTTTCCCTCAGTCGTCTTCCAAC 
GGCGGTATCACGACGGCGAAGGGAAGTGCTTCGTCGTTCTATGGCGGTTTCAGCTCTTTG 
ATCAACTACAACGCCGCCGTGAGCAGAAATGGGCCTGGTGGCGGGTTTAATGGGCCAGAT 
GCTTTTGGTCTTGGGCTTGGTCACGGGTCGTATTATGAGGACGTCAGATATGGGCAAGGA 
ATAACGGTCTGGCCGTTTTCAAGTGGCGCTACTGATGCTGCAACTACTACAAGCCACATT 
GCTCAAATACCCGCCACGTGGCAGTTTGAAGGTCAAGAGAGCAAAGTCGGGTTCGTGTCT 

GGAGACTACGTAGCGTGA 

>G1897 Amino Acid Sequence (domain in AA coordinates: 34-62)
MPSEFSESRRVPKIPHGQGGSVAIPTDQQEQLSCPRCESTNTKFCYYNNYNFSQPRHFCK
SCRRYWTHGGTLRDIPVGGVSRKSSKRSRTYSSAATTSVVGSRNFPLQATPVLFPQSSSN
GGITTAKGSASSFYGGFSSLINYNAAVSRNGPGGGFNGPDAFGLGLGHGSYYEDVRYGQG
ITVWPFSSGATDAATTTSHIAQIPATWQFEGQESKVGFVSGDYVA* 
>G2143 (89.. 784)

TCTTCTTCTTCCTCCATACCTTATCTCACCAGCTTCTCCATATCTCTCAAAGAAAAAACA 
AACCCTATAAATTCCACAAAAAAGGAGGATGGATAACTCCGACATTCTAATGAACATGAT 
GATGCAGCAGATGGAGAAGCTTCCTGAACACTTCTCTAACTCAAACCCTAACCCTAATCC 
CCATAACATTATGATGCTTTCTGAATCCAACACCCACCCGTTCTTCTTCAACCCCACTCA 
TTCTCATCTCCCATTTGACCAAACCATGCCTCACCACCAACCCGGTTTAAATTTCCGGTA 
CGCCCCCTCCCCGTCATCATCTCTCCCGGAGAAGAGAGGAGGCTGCAGCGACAACGCCAA 
CATGGCGGCGATGAGAGAGATGATCTTTCGAATAGCCGTGATGCAGCCTATACATATTGA 
TCCGGAATCCGTAAAGCCACCAAAGAGAAAGAACGTGAGGATCTCTAAGGATCCACAGAG 
CGTGGCAGCTCGGCATCGAAGGGAGAGGATAAGCGAGCGGATTCGGATTCTTCAGCGGCT 
TGTTCCCGGTGGGACTAAGATGGATACGGCGTCGATGCTCGATGAGGCTATCCATTACGT 
TAAGTTTCTCAAGAAGCAAGTGCAGTCGCTGGAGGAACATGCGGTGGTTAACGGCGGAGG

AATGACGGCGGTGGCCGGAGGAGCACTTGCGGGTACTGTTGGTGGAGGATATGGAGGAAA 

AGGGTGTGGCATTATGCGGTCTGATCATCACCAGATGCTTGGAAATGCACAGATTCTTAG 

ATGATGATGATGTTGATTTTTAAATATATATCATATGTTTATTAATATGACGGGAAAAAA 

TATTATCGAGGGAGTTGAATTTAGTATCATGAAACTATGAGAGC^TTTTrT 

TTTATCTTTCCGGGTTTCGATAATGTTTGGGATGGT^ 

AACTTGGTTGTAAAGACTAAAGAATAAGCATAGTTTATCAATTTATCATTACTAAATGAA 
ATAG 

>G2143 Amino Acid Sequence (domain in aa coordinates: 128-179) 
MDNSDILMNMMMQQMEKLPEHFSNSNPNPNPHNIMMLSESNTHPFFFNPTHSHLPFDQTM

PHHQPGLNFRYAPSPSSSLPEKRGGCSDNANMAAMREMIFRIAVMQPIHIDPESVKPPKR
KNVRISKDPQSVAARHRRERISERIRILQRLVPGGTKMDTASMLDEAIHYVKFLKKQVQS
LEEHAVVNGGGMTAVAGGALAGTVGGGYGGKGCGIMRSDHHQMLGNAQILR*
>G2535 (1..1005) 

ATGAACATATCAGTAAACGGACAGTCACAAGTACCTCCTGGCTTTAGGTTTCACCCAACC 
GAGGAAGAGCTCTTGAAGTATTACCTCCGCAAGAAAATCTCTAACATCAAGATCGATCTC 
GATGTTATTCCTGACATTGATCTCAACAAGCTCGAGC^ 

AAGATTGGAACGACGCCGCAAAACGATTGGTACTTTtATAGCCATAAGGAC^GAAGTAT 
CCCACCGGGACTAGAACCAACAGAGCCACCACGGTCGGATTTTGGAAAGCGACGGGACGT 
GACAAGACCATATATACCAATGGTGATAGAATCGGGATGCGAAAGACGCTTGTCTTCTAC 
AAAGGTCGAGCCCCTCATGGTCAGAAATCCGATTGGATCATGCACGAATATAGACTCGAC 
GAGAGTGTATTAATCTCCTCGTGTGGCGATCATGACGTCAACGTAGAAACGTGTGATGTC 
ATAGGAAGTGACGAAGGATGGGTGGTGTGTCGTGTTTTCAAGAAAAATAACCTTTGCAAA 
AACATGATTAGTAGTAGCCCGGCGAGTTCGGTGAAAACGCCGTCGTTCAATGAGGAGACT 
ATCGAGCAACTTCTCGAAGTTATGGGGCAATCTTGTAAAGGAGAGATAGTTTTAGACCCT 
TTCTTAAAACTCCCTAACCTCGAATGCCATAACAACACCACCATCACGAGTTATCAGTGG 
TTAATCGACGACC^GTCAAC^CTGCCACGTCTVGCAAAGTTATGGATCCCAGCTTCATC 
ACTAGCTGGGCCGCTTTGGATCGGCTCGTTGCCTCACAGTTAAATGGGCCCAACTCGTAT 
TCAATACCAGCCGTTAATGAGACTTCACAATCACCGTATCATGGACTGAACCGGTCCGGT 
TGTAATACCGGTTTAACACCAGATTACTATATACCGGAGATTGATTTATGGAACGAGGCA 
GATTTCGCGAGAACGACATGCCACTTGTTGAACGGTAGTGGATAA 

>G2535 Amino Acid Sequence (conserved domain in AA coordinates: 11-114)
MNISVNGQSQVPPGFRFHPTEEELLKYYLRKKISN

KIGTTPQNDWYFYSHKDKKYPTGTRTNRATTVGFWKATGRDKTIYTNGDRIGMRKT^ 
KGRAPHGQKSDWIMHEYRLDESVTjISSCGDHDVT^ 

NMISSSPASSVlCrPSFNEETIEQLLEvWGQSCKGEIVLDPFLKLPNLECHNWTTITSYQW 

liddqvl^chvskvmdpsfitswaald^^ 
CNTGLTPDYYIPEIDLWNEADFARTTCHLLNGSG*

>G2557 (94.. 1215) 

gccgttattacaacgaggattgtgtttgatccgatggaaggattggaatctgtgtacgct 
caagctatgtatggaatgacacgagagagcaaaatcatggagc^tcaaggatcagatttg 
atttggggaggaaatgagctaatggctcgagaactctgttcttcttcttcttatcaccac 
caactcattaatccgaatcttagcagctgtttcatgtctgatcttggagtcttaggtgag 
attcaacagcagcaacatgttggcaacagagctagctcgatagatccatcatcactcgat 
tgtttgttatctgcgacgtcgaatagcaacaacacctcgacggaggacgatgaaggaata 
tctgtgcttttctcagattgtcagactctttggagctttggtggagtctcatctgcagag 
tctgagaacagagagatcactactgagacgacaacaacgataaagcctaagcctttgaag 
agaaacagaggaggagatggaggaactactgagactac^caacaacaacaaaacctaag 
tctttgaagagaaac^gaggagacgagac^ggaagtcactttagtcttgttcatcctc^ 
gatgattcggagaaaggaggtttcaagcttatatacgatgagaatcaatcgaaatcaaag 
aaaccaagaacagagaaagaacgaggcggttcttcgaacattagtttccaacattcaact 
tgtttgtctgacaatgtcgagcccgatgctgaggcgattgcacaaatgaaggagatgata 
tacagagcggctgcatttagaccggtgaatttcgggttagagattgtggagaagcctaag 
aggaagaacgtcaagatatcgacggatcctcaaacggttgcagcgagacagagaagggag 
aggataagtgagaagattagggttttacaaacattggttccaggtgggacgaagatggat 
actgcat(^tgcttgatgaagctgctaattatct(^gttccttagagcacaagtaaaa 
GCTTTAGAAAACrTCAGACCCAAGCTC^

ACATCGTTTCCATTATTCCACCCATCTTTTCTTCCATTGCAAAATCCTAATCAAATCCAT 
CATCCAGAGTGTTGACAGATTATAAACTTTTGAGTTTCATCATCATCAACAGAAT(^ 
CGTCTTGATTGTTTTAGCAGTTCTCAAGAAAGGCAACTTCTGTGACAAGGGTGGTGTCGG 
GCAGTGTTGTTTACACTTTCCAGTCTTTGTTTTG^ 

TTTATATAGAATCTGTGGAATTCGAGGGTTGAAATATTGTGAAAAACAGAGCCGCAAGAG 
GTTAATTACAGTCTCTGCAATATTTTCAACCTTTTATTACTTTATTAGAGTAAAGATAGC 
GT 

>G2557 Amino Acid Sequence (domain in aa coordinates: 278-328) 

MEGLESVYAQAMYGMTRESKIMEHQGSDLIWGGNELMARELCSSSSYHHQLINPNLSSCF 

MSDLGVLGEIQQQQHVGNRASSIDPSSLDCLLSATSNSNNTSTEDDEGISVLFSDCQTLW

SFGGVSSAESENREITTETTTTIKPKPLKRNRGGDGGTTETTTTTTKPKSLKRNRGDETG

SHFSLVHPQDDSEKGGFKLIYDENQSKSKKPRTEKERGGSSNISFQHSTCLSDNVEPDAE 

AIAQMKEMIYRAAAFRPVNFGLEIVEKPKRKNVKISTDPQTVAARQRRERISEKIRVLQT

LVPGGTKMDTASMLDEAANYLKFLRAQVKALENLRPKLDQTNLSFSSAPTSFPLFHPSFL

PLQNPNQIHHPEC* 

>G259 (52.. 786) 

GAGATCTTCTACTACTTGTTTTCTTCAAGAATAATAATTTTCGTTTTATATATGGA 

GCTGGTGAACATTTACGGTGTAACGATAACGTTAACGACGAGGAGCGTTTGCCATTGGAG 

TTTATGATCGGAAACTCAACATCCACGGCGGAGCTACAGCCGCCTCCACCGTTCTTGGTA 

AAGACATACAAAGTGGTGGAGGATCCGACGACGGACGGGGTTATATCTTGGAACGAATAC 

GGAACTGGTTTCGTCGTGTGGCAGCCGGCAGAATTCGCTAGAGATCTGTTACCZAACACTT 

TTCAAGCATTGCAACTTCTCTAGCTTCGTTCGCCAGCTCAATACTTACGGTTTTCGAAAA 

GTAACGACGATAAGATGGGAATTTAGTAATGAGATGTTTCGAAAGGGGCAAAGAGAGCTT 

ATGAGCAATATCCGAAGAAGGAAGAGCCAACATTGGTCACACAACAAGTCTAATCACCAG

GTTGTACCAACAACAACGATGGTGAATCAAGAAGGTCATCMCGGATTGGGATTGATCAT 

CACCATGAGGATCAACAGTCTTCCGCCACTTCA^ 

GACGAAAACAAATGCTTGAAGAATGAAAACGAGTTATTAAGCTGCGAACTTGGGAAAACC 

AAGAAGAAATGCAAGCAGCTTATGGAGTTGGTGGAGAGATACAGAGGAGAAGACGAAGAT 

GCAACTGATGAAAGTGATGATGAAGAAGATGAAGGGCTTAAGTTGTTCGGAGTAAAACTT 

GAATGAAACTAGATTGCTAGATTGATATTCGTAATATACCAGTTTCTTCATATTCTTAGA 

AGTTTTGCATAACTATATATAGTACTCTTTTAAGACATGCAAGATCAGAACATATG 

>G259 Amino Acid Sequence (domain in AA coordinates: 27-131) 

MEDAGEHLRCNDNVNDEERLPLEFMIGNSTSTAELQPPPPFLVKTYKVVEDPTTDGVISW

NEYGTGFVVWQPAEFARDLLPTLFKHCTFSSFVRQLNTYGFRKVTTIRWEFSNEMFRKGQ

RELMSNIRRRKSQHWSHNKSNHQVVPTTTMVNQEGHQRIGIDHHHEDQQSSATSSSFVYT

ALLDENKCLKNENELLSCELGKTKKKCKQLMELVERYRGEDEDATDESDDEEDEGLKLFG 

VKLE* 

>G353 (82.. 570) 

ACCAAACTCAAAAAAC^CAAACCACAAGAGGATCATTTCATTTTTTATTGTTTCGTTTTA 
ATCATCATCATCAGAAGAAAAATGGTTGCGATATCGGAGATCAAGTCGACGGTGGATGTC 
ACGGCGGCGAATTGTTTGATGCTTTTATCTAGAGTTGGACAAGAAAACGTTGACGGTGGC 
GATCAAAAACGCGTTTTCACATGTAAAACGTGTTTGAAGCAGTTC 

TTAGGAGGTC^CCGTGCGAGTCACAAGAAGCCTAAC^^CGACGCTTTGTCGTCTGGATTG 
ATGAAGAAGGTGAAAACGTCGTCGCATCCTTGTCCCATATGTGGAGTGGAGTTTCCGATG 
GGACAAGCTTTGGGAGGACACATGAGGAGACACAGGAACGAGAGTGGGGCTGCTGGTGGC
GCGTTGGTTACACGCGCTTTGTTGCCGGAGCCCACGGTGACTACGTTGAAGAAATCTAGC 
AGTGGGAAGAGAGTGGCTTGTTTGGATCTGAGTCTAGGGATGGTGGACAATTTGAATCTC 
AAGTTGGAGCTTGGAAGAACAGTTTATTGATTTT^ 

ATATTTGTTTCTCTCATTCTTTGAATTTTTCTTAATATTCTAGATTATACATACATCCGC 

AGATTTAGGAAACTTTCATAGAGTGTAATCTTTTCTTTCTGTAAAAATATAT^ 

TAGCAAA 

>G353 Amino Acid Sequence (domain in aa coordinates: 41-61, 84-104) 
MVAISEIKSTVDVTAANCLMLLSRVGQENVDGGDQKRVFTCKTCLKQFHSFQALGGHRAS
HKKPNWDALSSGLMKKVKTSSHPCPICGVEFPMGQALGGHMRRHRNESGAAGGALVTRAL
LPEPTVTTLKKSSSGKRVACLDLSLGMVDNLNLKLELGRTVY*
>G354 (27.. 533) 
CCTAGAAGTCACTAAGTCGATTCAAAATGGTTGCGAGAAGTGAGGAAATTGTGATAGTGG

AAGAAGATACGACTGCGAAATGTTTGATGTTGTTATCAAGAGTCGGAGAATGCGGCGGCG 

GCTGCGGGGGAGATGAACGTGTTTTCCGATGCAAGACTTGTCTTAAAGAGTTCTCATCGT 

TTCAAGCTTTGGGAGGTCZATCGTGCAAGCCACAAGAAACTTATCAACAGTGAC 

CACTTCTTGGATCCTTGTCCAACAAGAAAACTAAAACGTCTCATCCTTGTCCGATATGTG 

GAGTGAAGTTTCCGATGGGACAAGCTCTTGGTGGTCACATGAGGAGACATAGGAACGAGA 

AAGTCTCAGGCTCGTTGGTTAC^CGTTCI^TTCTACCGGAGACGACGACGGTGACGGCTT 

TGAAGAAATTTAGTAGTGGGAAGAGAGTGGCTTGTTTGGATTTGGACTTAGATTCGATGG 

AGAGTTTGGTCAATTGGAAGTTGGAGTTGGGAAGAACGATTTCTTGGAGTTAAGTTTTTG 

GGTTGTATACAGTTTCACATGATTTTGTAATCTTTGTTGATCCAATTATCGTACCGATCG 

ATGTGAATATTATTTTGATACAATAAAA 

>G354 Amino Acid Sequence (domain in aa coordinates: 42-62, 88 
MVARSEEIVIVEEDTTAKCLMLLSRVGECGGGCGGDERVFRCKTCLKEFSSFQALGGHRA 

SHKKLINSDNPSLLGSLSNKKT^ 

SFLPETTTVTALKKFSSGKRVACLDLDLDSMESLVNWKLELGRTISWS* 
>G638 (86.. 1861) 

GAATTAAAAGGTTTAACCTTTACCTTTTTTTCCCTTCACTATCGATAATTGATCTTCTCT 
TTCGGCTGAATATAAATCTGAAAAAATGGATCAAGATCAGCATCCTCAGTACGGTATACC 
GGAGCTCCGGCAGCTCATGAAAGGCGGAGGAAGGACX3ACTACTACAACACCGTCTACTTC 

TTCTCATTTTCCCTCTGATTTCTTCGGTTTTAACCTTGCTCCGGTC 

CCGTCTTCATCAGTTCACTACTGATCAAGATATGGGTTTCTTGCCACGTGGCATACATGG 

ATTGGGTGGAGGTTCTTCAACGGCTGGAAATAACAGTAACTTAAACGCGAGTACTAGTGG 

TGGAGGAGTTGGGTTTAGTGGGTTTCTTGACGGTGGTGGTTTCGGCAGCGGAGTAGGAGG 

AGACGGTGGAGGAACTGGAAGGTGGCCGAGACAAGAAACCCTAACTCTGTTGGAAATTAG 

ATCTCGTCTTGATCATAAATTCAAAGAAGCTAATCATAAAGGACCTCTTTGGGATGAAGT 

TTCTAGGATTATGTCCGAGGAACATGGATACCAAAGGAGTGGGAAGAAATGCAGAGAGAA 

GTTTGAGAATCTGTACAAATACTATAGTAAGACTAAAGAAGGCGAAGeCGGAAGACAAGA 

CGGAAAACATCACAGATTTTTCCGCCAGCTCCAAGCGCTATACGGGGATTCTAATAACTT 

GGTTTCTTGTCCCAATCATAACAC^ 

TCAAAACCCTATGAACGTTGCTACAACAACGTCCAACATCCATAACGTTGATAGTGTTCA 
TGGTTTTCATCAAAGCCTTAGTCTTTCT 

GACTTCCTCTTCGGAAGGGAATGATTCTAGTAGTAGAAGGAAAAAGAGGAGTTGGAAAGC 

GAAGATAAAGGAGTTCATTGATACGAACATGAAAAGGTTGATAGAGAGGCAAGATGTTTG 

GCTTGAGAAGTTGACAAAGGTTATTGAAGACAAAGAGGAACAACGGATGATGAAAGAAGA 

GGAATGGAGGAAGATTGAAGCTGCAAGGATTGATAAAGAGCATTTGTTTTGGGCTAAAGA 

GAGGGCGAGGATGGAAGCTAGGGATGTTGCGGTGATTGAGGCATTGCAATACTTGACAGG 

AAAGCCATTGATAAAGCCGCTGTGTTCATCCCCGGAAGAGAGGACAAATGGTAATAATGA 

GATCCGAAACAATAGTGAGACACAGAATGAGAATGGAAGCGATCAAACGATGACTAACAA 

TGTTTGTGTTAAAGGAAGTAGTAGCTGCTGGGGTGAGCAAGAGATTTTAAAGCTTATGGA 

GATAAGAACGAGCATGGACTCGACCTTTCAAGAGATATTAGGAGGGTGCTCGGATGAGT^ 

TCTATGGGAGGAAATCGCAGCGAAGTTGATTCAGTTAGGGTTTGATCAGAGAAGTGCCTT 

ATTATGCAAGGAAAAGTGGGAATGGATAAGCAATGGAATGAGGAAAGAAAAGAAGCAAAT

CAACAAGAAAAGAAAGGATAATTCGTCCAGCTGCGGCGTGTACTACCCGAGAAACGAAGA 

AAATCCAATCTACAATAATCGAGAAAGTGGATATAATGATAATGATCCGCATCAAATCAA 

CGAACAAGGCAATGTAGGTTCTTCAACATCAAACGCAAACGCAAACGCAAACGTAACCAC 

TGGAAATCCGAGCGGTGCAATGGCTGCTAGTACAAACTGCTTCCCGTTCTTCATGGGAGA 

TGGAGATCAGAATTTGTGGGAGAGTTATGGTTTGAGGCTCAGTAAAGAAGAGAATCAGTA 

AGTAATTTCTCTTAATGAAGAAGAAGAAGGTAATCATGTGGTTAACTAATTCTTTTGAGT 

TAGCTATATATGAGATAAACCTTGACTTAGCTATTATATGTCACATGCTGCTTAGAATTA 

AGAAATATTTGTTGGGGCTTAACGAATTATATATCAGCATATATAAGATGAGAGTCTAAG 

AATTATATCAAATTAGGCTTTAACCAACGTACGATTATATATTATGTTTTCATGTATTTA 

TTCTGTAAGACTTTTTAATATCAATCTTTCTCTAAA 

>G638 Amino Acid Sequence (domain in AA coordinates: 119-206) 

MDQDQHPQYGIPELRQLMKGGGRTTTTTPSTSSHFPSDFFGFNLAPVQPPPHRLHQFTTD 

QDMGFLPRGIHGLGGGSSTAGNNSNLNASTSGGGVGFSGFLDGGGFGSGVGGDGGGTGRW

PRQETLTLLEIRSRLDHKFKEANHKGPLWDEVSRIMSEEHGYQRSGKKCREKFENLYKYY 

SKTKEGEAGRQDGKHHRFFRQLQALYGDSNNLVSCPNHNTQFMSSALHGFHTQNPMNVAT
TTSNIHNVDSVHGFHQSLSLSNNYNSSELELMTS

NMKRLIERQDVWLEKLTKVIEDKEEQRMMKEEEWRKIEAARIDKEHLFWAKERARMEARD
VAVIEALQYLTGKPLIKPLCSSPEERTNGNNEIRNNSETQNENGSDQTMTNNVCVKGSSS
CWGEQEILKLMEIRTSMDSTFQEILGGCSDEFLWEEIAAKLIQLGFDQRSALLCKEKWEW
ISNGMRKEKKQINKKRKDNSSSCGVYYPRNEENPIYNNRESGYNDNDPHQINEQGNVGSS
TSNANANANVTTGNPSGAMAASTNCFPFFMGDGDQNLWESYGLRLSKEENQ* 
>G869 (428..1402)

AGGAA(^GTGAAAGGTTCGGTTTTTTGGGTTTCGATCTGATAATCAACAAGAAAAAAGGG 
TTTGATTTATGTCGGCTGGGTTTGAATCGACTGTGATTTTGTCTTTGATTCATATCTCTT 
CTCCGATTTCATCATCATCTTCCCCATCATCGTCGTCTTTGAAATCOTGTCTTCTCAACG 
CTCTTCACTTCTGCTGTAATAAGCAGAGGCTTGTTCTGGAGACTCCTTCTCTTTCCATGC 
GCTTAAGACCCAAAAGGACTTGTTCTAGTGTTGAAGTCTTTGGGGGTTTTCACATAAAGC 
AGCAAAAGTTTTCTTTTTTCATAGTTCGCTGAGAGTO 

TTTGACCTTTTAGAGTGATTTTTTGTTCTTTCTGTTTTCTGGGTATTTTTGAGGAGTG 

TTTAACAATGGTTGCGATTAGAAAGGAACAGTCTTTGAGTGGTGTTAGTAGCGAGATTAA 

GAAGAGAGOTAAGAGAAACACTCTATCGTCCCTTCCTCAAGAAACCC^CCTTTGAGG^ 

AGTCCGTATTATTGTGAATGATCCTTATGCTACTGATGATTCCTCTAGTGATGAGGAAGA 

GCTTAAGGTTCCTAAGCCAAGGAAAATGAAACGTATCGTTCGTGAGATTAACTTTCCTTC 

TATGGAAGTTTCTGAACAGCCTTCTGAGAGTTCTTCTCAGGACAGTACTAAAACTGATGG 

CAAGATAGCTGTGTCAGCTTCTCCTGCTGTTCCTAGGAAGAAGCCTGTTGGTGTTAGGCA 

AAGGAAATGGGGGAAATGGGCTGCTGAGATTAGAGATCCTATTAAGAAAACTAGGACTTG 

GTTGGGTACTTTTGATACTCTTGAAGAAGCTGCTAAAGCTTATGATGCTAAGAAGCTTGA 

GTTTGATGCTATTGTTGCTGGAAATGTGTCCACTACTAAACGTGATGTTTCTTCATCTGA 

GACTAGCCAATGCTCTCGTTCTTCACCTGTTGTTCCTGTTGAGCAAGATGACACTTCTGC 

ATCAGCTCTCACTTGTGTCAACAACCCTGATGACGTCTCGACCGTTGCTCCAACTGCTCC 

AACTCCAAATGTTCCTGCTGGTGGAAACAAGGAAACGTTGTTCGATTTCGACTTTACTAA 

TCTACAGATCCCTGATTTTGGTTTCTTGGCAGAGGAGCAACAAGACCTAGACTTCGATTG 

TTTCCTCGCGGATGATCAGTTTGATGATTTCGGCTTGCTTGATGACATTCAAGGATTCGA 

AGATAACGGTCCAAGTGCGTTACCAGATTTCGACTTTGCGGATGTTGAAGATCTTCAGCT 

AGCTGACTCTAGTTTCGGTTTCCTTGATCAACTTGCTCCTATCAACATCTCTTGCCCATT 

AAAAAGTTTTGCAGCTTCATAGGATCTTGCTTAGTAATGTTAAGTGAGAAGAGTGTTTTG 

TTTTTTCGTTTATGCTTTAGTAATTTAAGACATACAAAAGTGTGTGTTCCGGATTGTAGT 

AAGATCTTAAGACATAAAGCCGGGTTTTGCAATTAGGAATCGAGTTTTAATGAAGTTTTA 

GTTTATGTTTG 

>G869 Amino Acid Sequence (domain in AA coordinates: 109-177) 
MVAIRKEQSLSGVSSEIKKRAKRNTLSSLPQETQPLRKVRIIVNDPYATDDSSSDEEELK
VPKPRKMKRIVREINFPSMEVSEQPSESSSQDSTKTDGKIAVSASPAVPRKKPVGVRQRK
WGKWAAEIRDPIKKTRTWLGTFDTLEEAAKAYDAKKLEFDAIVAGNVSTTKRDVSSSETS
QCSRSSPVVPVEQDDTSASALTCVNNPDDVSTVAPTAPTPNVPAGGNKETLFDFDFTNLQ 
IPDFGFLAEEQQDLDFDCFLADDQFDDFGLLDDIQGFEDNGPSALPDFDFADVEDLQLAD 
SSFGFLDQLAPINISCPLKSFAAS* 
>G1645 (25..1104)

CGTCGACCTCCCAACACTAACTCCATGTTTATAACGGAAAAACAAGTGTGGATGGATGAG 
ATCGTCGCAAGAAGAGCTTCTTCTTCTTGGGACTTCCCTTTCAACGACATTAATATTCAT 
CAGCATCATCATCGTCACTGCAACACAAGTCATGAGTTTGAAATCTTGAAGAGTCCTCTT 
GGAGATGTAGCGGTTCACGAAGAAGAGAGTAATAATAATAACCCTAATTTCAGTAACAGC 
GAGAGTGGTAAGAAGGAGACAACAGATAGTGGTCAGTCTTGGTCCTCGTCGTCTTCAAAA 
C(^TCGGTCTTGGGGAGAGGACATTGGAGACCAGCTGAAGATGTTAAACTCAAAGAGCTT 
GTCTCCATTTACGGCCCACAAAACTGGAACCTCATAGCTGAAAAGCTTCAAGGAAGATCT 
GGGAAGAGCTGTAGACTACGATGGTTTAACCAATTGGACCCGAGGATAAACCGAAGAGCT 
TTCACAGAAGAAGAAGAGGAGAGGCTGATGCAAGC^CATAGGCTTTATGGTAACAAATGG 
GCAATGATTGCGAGGCTTTTCCCTGGTAGAACTGATAATTCAGTGAAGAACCATTGGCAT 
GTTGTCATGGCTCGTAAGTATAGAGAACACTCTTCTGCTTACCGTAGGAGAAAGCTTATG 
AGTAATAATCCACTTAAACCTCACCTCACCAATAATCATCATCCTAACCCTAACCCTAAT 
TACCACTCTTTTATCTCCACTAATCATTACTTCGCTCAGCCTTTCCCCGAGTTTAATTTG 
ACTCATCACCTGGTTAATAATGCCCCTATCACGAGTGACCATAACCAGCTTGTGTTGCCT 
TTCCATTGCTTTCAAGGTTATGAGAACAATGAACCTCCGATGGTTGTGAGTATGTTTGGC 
AACCAAATGATGGTCGGCGATAACGTTGGTGCC^CGTCAGACGCGTTATGCAATATTCCG
CACATTGACCCTAGTAACCAAGAGAAACCGGAGCCAAATGATGCAATGCATTGGATCGGA 
ATGGACGCGGTAGATGAGGAGGTGTTCGAAAAGGCTAAGCAGCAACCACATTTTTTCGAT 
TTTCTTGGCTTGGGGACGGCGTGAATGTTGAACAAATTGGTGTTAATCAGATAACGACAG 

TGGC 

>G1645 Amino Acid Sequence (domain in AA coordinates: 90-210) 
MFITEKQVWMDEIVARRASSSWDFPFNDINIHQHHHRHCOT 
ESNNIWPNFSNSESGKKETTDSGQSWSSSSSKPSVLGRGHWRPAEDV^ 
WNLIAEKLQGRSGKSCRLRWFNQLDPRINRRAFTEEEEERLMQAHRLYGNKWAMIARL 

GRTDNSVKNHWHVVMARKYREHSSAYRRRKLMSNNPLKPH^ 

HYFAQPFPEFNLTHHLVNNAPITSDHNQLVLPFHCFQGYENNEPPMVVSMFGNQMMVGDN
VGATSDALCNIPHIDPSNQEKPEPNDAMHWIGMDAVDEEVFEKAKQQPHFFDFLGLGTA* 

>G1038 (240..1574)

GCTCGTTTTCAAATTAAAAACAGGGAGAAATTTGGAAATTCCAGTACGACGGGAGATAAA 

ACCTAACATACGCCATGGTGACCGTTATCTAAACTACGCCAAAATATTTGAAGTGTCGTC 

GTTTCATAATAAAACGCAAACAAAAACCCACTCCCACTTTCTCCTTTCCAAAAAAAGAAC 

TCTCGCCACTTTCTCTGCTCTTTTCTTTCTCTCTCTCTTTCTTGTTTTCGCCGGCGATCA 

TGGAGAAAAGCGGCTTCTCTCCCGTCGGTCTAAGGGTTCTTGTCGTAGACGATGATCCAA 

CTTGGCTCAAGATTCTCGAGAAAATGCTCAAGAAGTGTTCTTACGAAGTAACGACCTGTG 

GATTAGCTAGAGAGGCTTTGAGGTTGCTGAGGGAGCGTAAAGATGGATATGATATCGTGA 

TCAGCGATGTGAACATGCCTGACATGGATGGTTTCAAGCTTCTTGAGCATGTTGGTCTTG

AATTAGACCTCCCTGTAATAATGATGTCGGTGGACGGCGAAACAAGCCGAGTGATGAAGG 

GAGTGCACACGGGAGCTTGTGATTACCTCTTGAAGCCGATAAGAATGAAGGAGTTAAAGA 

TTATATGGCAACATGTTCTGAGAAAGAAGCTTCAAGAAGTGAGAGATATCGAAGGCTGTG 

GATACGAAGGAGGAGCGGATTGGAT(^CTCGATACGATGAAGCA(^TTTTCTTGGAGGTG 

GTGAAGATGTTTCTTTTGGGAAAAAGAGAAAAGACTTTGACTTTGAGAAGAAGCTTCTTC 

AAGATGAGAGTGATCCATCATCTTCTTCTTCCAAGAAAGCTAGAGTTGTTTGGTCTTTTG 

AGCTTCATCATAAGTTTGTCAACGCCGTTAACCAAATCGGATGCGATCACAAAGCTGGTC 

CCAAGAAGATATTGGATCTCATGAATGTTCCATGGCTCACTAGAGAAAATGTTGCAAGCC 

ACCTTCAGAAATATAGACTTTACCTGAGCAGATTAGAGAAAGGAAAGGAGCTCAAGTGTT 

ATTCAGGTGGCGTGAAGAATGCGGATTCATCTCCAAAAGATGTCGAAGTGAATTCAGGCT 

ACCAAAGCCCTGGGAGGAGCAGCTATGTATTCTCTGGAGGAAATTCTCTGATCCAAAAAG 

CAACAGAGATTGATCCAAAGCCACTTGCTTCAGCTTCTTTGTCTGACCCCAACACCGATG 

TGATCATGCCTCCGAAAACAAAAAAGACGCGTATAGGATTTGATCCTCCCATTTCCTCCT 

CTGCGTTTGACTCTCTGCTTCCTTGGAATGATGTTCCAGAGGTCCTTGAATCGAAGCCGG 

TTCTGTATGAGAATAGCTTTCTCCAGCAACAACCATTGCCAAGTCAAAGTTCCTATGTTG 

CAATTTCTGCACCATCTCTCATGGAGGAGGAAATGAAGCCTCCTTATGAGACACCAGCAG 

GAGGCAGTAGTGTGAATGCAGATGAGTTTCTCATGCCACAAGACAAGATCCCTACTGTAA 

CCCTTCAAGATTTGGATCCCTCTGCCATGAAGCTG^ 

CTGAAGAAGCTTGAACTGGGGAACTTCCAGAATCACATCATTCTGTTTCTTTAGACACTG 

ACTTAGACTTGACTTGGCTTCAAGGCGAGCGTTTCTTGCAAACACCGACTCCAGTTTCAA 

GATACAGTAGTAGCCCATCACTCCTATCTGAGCTCCCAGCCCACCTTAATTGGTATGGAA 

ATGAGCGGCTGCCTGACCCTGACGAGTATTCCTTCATGGTAGACCAAGGTTTATTCATAT 

CTTAACCTTGTTCCAATAACTTCTTTTCGTATATTGGTTGGTGTAATGCAGAAAGATTTT 

GTGGGTATACCTGAAAATAATCTTGCTTTCCCAAGAACCTTCCATGATCGGATGCATTGT 

ACAATAATCCACGAGTGTCGTAGGCTAATTACACCAAACAGGTTGATGACAGTGATAAGG 

CCACATGTTTCACACCGTCGCTTAAGATCTTTACTGTCACCTGGAAGGAAA 

>G1038 Amino Acid Sequence (domain in AA coordinates: 198-247) 

MEKSGFSPVGLRVLVVDDDPTWLKILEKMLKKCSYEVTTCGLAREALRLLRERKDGYDIV

ISDVNMPDMDGFKLLEHVGLELDLPVIMMSVDGETSRVMKGVHTGACDYLLKPI 

IIWQHVLRKKLQETODIEGCGYEGGADWITRYDEAHFLGGGEDVSFGKKRKDFDFEKKLL 

QDESDPSSSSSKKARVVWSFELHHKFVNAVNQIGCDHKAGPKKILDLMNVPWLTRENVAS

HLQKYRLYLSRLEKGKELKCYSGGVKNADSSPKDVEVNSGYQSPGRSSYVFSGGNSLIQK 

ATEIDPKPLASASLSDPNTDVIMPPKTKOTRIGFDPPISSSAFDSLLPWNDVPEVLESKP 

VLYENSFLQQQPLPSQSSYVAISAPSLMEEEMKPPYETPAGGSSVNADEFLMPQDKIPTV 

TLQDLDPSAMKLQEFNTEGDSEEA* 
>G1073 (62..874)
CCCCCCGACCTGCCTCTACAGAGACCTGAAGATTCCAGAACCCCACCTGATCAAAAATAA
CATGGAACTTAACAGATCTGAAGCAGACGAAGCAAAGGCCGAGACCACTCCCACCGGTGG 
AGCCACCAGCTCAGCCACAGCCTCTGGCTCTTCCTCCGGACGTCGTCCACGTGGTCGTCC 
TGCAGGTTCCAAAAACAAACCCAAACCTCCGACGATTATAACTAGAGATAGTCCTAACGT 
CCTTAGATCACACGTTOTTGAAGTCACCTCCGGTTCGGACATATCCGAGGCAGTCTCCAC 
CTACGCCACTCGTCGCGGCTGCGGCGTTTGCATTATAAGCGGCACGGGTGCGGTCACTAA 
CGTCACGATACGGCAACCTGCGGCTCCGGCTGGTGGAGGTGTGATTACCCTGCATGGTCG 
GTTTGACATTTTGTCTTTGACCGGTACTGCGCTTCCACCGCCTGCACCACCGGGAGCAGG 
AGGTTTGACGGTGTATCTAGCCGGAGGTCAAGGACAAGTTGTAGGAGGGAATGTGGCTGG 
TTCGTTAATTGCTTCGGGACCGGTAGTGTTGATGGCTGCTTCTTTTGCAAACGC^GTTT^ 
TGATAGGTTACCGATTGAAGAGGAAGAAACCCCACCGCCGAGAACCACCGGGGTGCAGCA 
GCAGCAGCCGGAGGCGTCTCAGTCGTCGGAGGTTACGGGGAGTGGGGCCCAGGCGTGTGA 
GTCAAACCTCCAAGGTGGAAATGGTGGAGGAGGTGTTGCTTTCTACAATCTTGGAATGAA 
TATGAACAATTTTCAATTCTCCGGGGGAGATATTTACGGTATGAGCGGCGGTAGCGGAGG 
AGGTGGTGGCGGTGCGACTAGACCCGCGTTTTAGAGTTTTAGCGTTTTGGTGACACCTTT 
TGTTGCGTTTGCGTGTTTGACCTCAAACTACTAGGCTACTAGCTATAGCGGTTGCGAAAT 
GCGAATATTAGGTT 

>G1073 Amino Acid Sequence (domain in AA coordinates: 33-42, 78-175) 

MELNRSEADEAKAETTPTGGATSSATASGSSSGRRPRGRPAGSKNKPKPPTIITRDSPNV 

LRSHVLEVTSGSDISEAVSTYATRRGCGVCIISGTGAVTNVTIRQPAAPAGGGVITLHGR

FDILSLTGTALPPPAPPGAGGLTVYLAGGQGQVVGGNVAGSLIASGPVVLMAASFANAVY

DRLPIEEEETPPPRTTGVQQQQPEASQSSEVTGSGAQACESNLQGGNGGGGVAFYNLGMN 

MNNFQFSGGDIYGMSGGSGGGGGGATRPAF* 

>G1146 (129..3095)

cttctctagcgtcactcttcttcttcattggtcggtagaataaggccaaggaagggatca 
gttttaagttttgtttcattctttttgtagtggagaaaaagagtttttgaaaatcaaaac 
aacaaaaaatgccgattaggcaaatgaaagatagctctgagactcacttagttatcaaaa 
cccaacctttaaagcaccacaatccaaaaaccgttcaaaacggtaaaatccctcctcctt 
ctccttctccggtgacggtgactactccggcgacggttactcagagtcaagcttcttcac 
cttcaccaccgtcaaagaatcgtagccggaggagaaaccgtggtggaagaaaatctgatc 
aaggagatgtttgtatgagacctagctctcgtcctcgtaaaccgccaccgccaagtcaaa 
ccacttcctccgccgtctccgtcgccaccgccggtgagattgtcgctgtgaatcatcaga 
tgcagatgggtgttcgtaaaaactcaaactttgctccaagacctggatttggaacacttg 
gaactaaatgcattgttaaagctaaccactttctcgctgatttgcctaccaaggatttga 
atcagtatgatgttacaattactcctgaagtgtcatcaaagagtgttaacagagctataa 
ttgctgagttagttagactttacaaagagtctgatctcgggaggagacttccggcttacg 
atggccggaaaagtctttacactgctggagaacttccttttacttggaaggagttcagtg 
ttaagattgttgatgaagatgacggtatcatcaatggccctaaaagggagagatcatata 
aggtggcaatcaagtttgttgcacgggcaaatatgcatcacttaggcgagtttctagctg 
gtaaacgggcagattgtccgcaagaggcggtgcagattcttgatattgtactcagggagt 
tgtcggttaagaggttttgtcccgttggaagatctttcttttcgcctgatattaaaacac 
cgcagcgactcggtgaagggttagagtcatggtgtgggttttaccagagtattagaccaa 
ctcaaatgggtttatcactaaatatcgatatggcttcagctgcattcatcgagcctcttc 
cagtgatagagtttgtagcacagcttcttggaaaggatgtcttgtcgaagccattgtcgg 
attctgatcgcgtcaagattaagaagggtcttagaggagtgaaagtagaggttactcaca 
gagcgaatgtaagaaggaaataccgtgttgcgggtttaacaactcaaccaacaagagagc 
taatgtttccagtagatgagaactgtactatgaagtcagttattgagtatttccaagaga 
tgtatggattcacgatccagcacacgcatttgccatgtctccaagttggaaaccaaaaga 
aggcaagctatttgccgatggaggcatgcaaaattgtcgagggacaacggtacacgaaaa 
ggttgaatgagaagcagattactgctctcttgaaagttacatgccaaagggccgagggac 
agagaaacgatattttgcggactgtccaacacaacgcatatgatcaagatccatatgcaa 
aggagtttggcatgaacataagcgaaaagttagcttctgttgaagctcgtattcttccag 
ctccatggcttaagtatcacgagaacgggaaagaaaaagattgtctcccgcaagttggtc 
agtggaatatgatgaacaagaaaatgatcaacgggatgactgtgagcagatgggcctgtg 
ttaacttctcacgcagcgttcaagaaaacgttgctcgtggattttgtaatgaacttggtc 
agatgtgtgaagtctcaggcatggagtttaatccagaacccgtgataccaatatatagtg 
cgaggcccgatcaagtcgagaaagctctaaagcatgtttatcacacttcaatgaacaaaa 
ccaaaggcaaagagttagagcttctgctggcaatattacctgataacaacggttcacttt

atggtgatcttaagagaatctgtgaaaccgagcttggtttgatatctcaatgttgtctca 

caaaacatgtgttcaagattagcaaacagtatctggcagatgtatcccttaaaatcaacg 

taaagatgggaggaaggaacacagttctagtagacgccataagctgtagaattccactgg 

ttagcgatataccgacaatcatttttggcgcagacgtgactcacccagagaacggggaag 

agtcaagcccttcaatcgctgctgttgttgcttctcaagactggcctgaagtgacaaaat 

atgcgggtttagtttgtgctcaagctcacaggcaagaacttatacaagatttgtataaaa 

catggcaagatcctgttcgcggtactgttagtggcggtatgatcagggaccttcttatct 

catttagaaaagcaacagggcaaaaaccgcttcgaattatcttttatcgtgatggagtaa 

gcgaagggcaattctatcaagttttactctatgagttggatgcaattcgaaaggcttgtg 

catcgcttgaaccgaattatcagccaccggtgacattcatagttgtacagaagcgtcacc 

acactcgtttgtttgctaataatcaccgagacaaaaacagtactgaccgaagcggaaata 

tcttaccaggtactgtagttgacactaaaatatgtcatccaactgaattcgacttctacc 

tttgtagccatgcgggtattcagggaacaagcaggcctgcacattaccatgttctttggg 

acgagaacaatttcacagcagatggtattcaatctctgactaacaatctctgttatacct 

atgcgcggtgcactcggtcggtctctatagttcctccagcgtattatgctcatcttgcag 

catttcgagcacgtttctacctggaacctgagataatgcaagacaacggatcaccgggta 

aaaagaacacgaaaacaacaactgtcggagacgtaggtgtgaagcctttaccagccttga 

aggagaatgtgaagagagtaatgttctactgctaaaaatccaaacattccttaatcagtt 

ttaataagtagtttggttgtttgcttgtagttcggctttagatttaccaatgtttttctt 

atgtaaattttgtcggtttggtttaagcctttaggaattagtgtattagggtttttctaa 

agttgtactttagctgatgataacgttgatgcagtgactttgttaaaacctcctcttcta 

cagtagtgtttacgtcgttcctc

>G1146 Amino Acid Sequence (domain in AA coordinates: 886-896) 
MPIRQMKDSSETHLVIKTQPLKHHNPKTVQNGKIPPPSPSPVTVTTPATVTQSQASSPSP
PSKNRSRRRNRGGRKSDQGDVCMRPSSRPRKPPPPSQTTSSAVSVATAGEIVAVNHQMQM 
GVRKNSNFAPRPGFGTLGTKCIVKANHFLADLPTKDLNQYDVTITPEV 
LVRLYKESDLGRRLPAYDGRKSLYTAGELPFTWKEFSVKIVDEDDGIINGPKRERSYKVA 

IKFVARANMHHLGEFLAGKRADCPQ 

LGEGLESWCGFYQSIRPTQMGLSLNIDMASAAFIBPLPVIEFVAQLLGKDVLSKPLSDSD 
RVKIKKGLRGVKVEVTHRANVRRKYRVAGLTTQPTRELMF 

FTIQHTHLPCLQVGNQKKASYLPMEACKIVEGQRYTKRLNEKQITALLKVTCQRAEGQRN 
DILRTVQHNAYDQDPYAKEFGMNISEKLASVEARILPAPWLKYHENGKEKDCLPQVGQWN
l^KKMINGMTVSRWACWFSRSVQENVARGFCNELGQMCEV^ 

DQVEKALKHVYHTSMNKTKGKELELLLAILPDI^GSLYGDLKRICETELGLISQCCLTKH 
VFKISKQYLADVSLKINVKMGGRNTVLVDAISCRIPLVSDIPTIIFGADVTHPENGEESS
PSIAAVVASQDWPEVTKYAGLVCAQAHRQELIQDLYKTWQDPVRGTVSGGMIRDLLISFR
KATGQKPLRIIFYRDGVSEGQFYQVLLYELDAIRKACASLEPNYQPPVTFIVVQKRHHTR
LFANNHRDKNSTDRSGNILPGTVVDTKICHPTEFDFYLCSHAGIQGT 
NFTADGIQSLTNNLCYTYARCTRSVSIVPPAYYAHLAAFRARFYLEPEIMQDNGSPGK^ 

TKTTTVGDVGVKPLPALKENVKRVMFYC* 
>G1267 (152..967)

AAGTAGAGAATAATAATCACATCAAGATTGTTTATAACCCTCCCCNTAATCACCTTCTTA 
NTNACGACCCTCTCCGGCTCTCAACAGAACAACAAC^ 

TTCCGGCGAAATCGGACGGTCGAGATCAATCATGCATCGTAGAGCAGCAATTCAAGAATC 
GGATGACGAAGAAGATGAGACTTACAACGACGTCGTTCCTGAATCTCCTTCTTCTTGTGA 
AGACTCAAAGATCTCAAAACCAACTCCAAAGAAAAGGAGGAACGTAGAGAAGAGAGTTGT 
CTCAGTTCCGATAGGTGACGTGGAAGGATCTAAGAGCAGAGGCGAAGTATATCCACCGTC 
CGATTCATGGGCCTGGAGAAAGTACGGACAAAAACCGATCAAAGGCTCGCCTTATCCCAG 
GGGATATTACAGATGTAGTAGCTCAAAAGGATGTCCGGCGAGGAAGCAGGTGGAGAGAAG 
CCGTGTGGACCCTTCTAAGCTTATGATTACTTACGCCTGCGACCACAATCACCCTTTCCC 
TTCCTCCTCCGCTAACACCAAATCCCACCACCGCTCCTCCGTCGTCCTCAAAACCGCAAA 
GAAAGAGGAAGAATACGAAGAGGAGGAAGAAGAACTAACCGTCACCGCCGCAGAGGAACC 
ACCGGCGGGACTTGATCTAAGCCACGTAGACTCACCGTTGCTATTAGGCGGCTGCTACAG 
CGAAATCGGAGAGTTCGGGTGGTTCTACGACGCGTCGATCTCATCATCATCTGGTTCTTC 
GAATTTCCTCGACGTAACTCTAGAGAGAGGTTTTTCAGTAGGCCAAGAGGAAGATGAGTC 
TTTGTTCGGTGATCTCGGTGATTTACCTGATTGCGCCTCCGTGTTCCGCCGTGGGACTGT 
TGCGACGGAGGAGCAACATCGAAGATGTGATTTTGGCGCCATTC

TAGATGAGTTTGTGTGTGTAGCCAAAACCAAAAGAAAAAAACACAATT 

ACTGTAAAGGTGTATCAATGGTGGATTCATTTTTTTAAAAAAAAAAAAAAAA 

>G1267 Amino Acid Sequence (domain in AA coordinates: 70-127) 

MHRRAAIQESDDEEDETYOTVVPESPSSCEDSKISKPTPKKRRNVEKRW 

KSRGEVYPPSDSWAWRKYGQKPIKGSPYPRGYYRCSSSKGCPARKQVERSRVDPSKLMIT 

YACDHiraPFPSSSAOTKSHHRSSVVLKTAKKEEEyEEEEEELTVTAAEEPPAGLDLSHVD 

SPLLLGGCYSEIGEFGWFYDASISSSSGSSNFLDVTLERGFSVGQEEDESLFGDLGDLPD 

CASVFRRGTVATEEQHRRCDFGAIPFCDSSR* 

>G1269 (88..951)

AACAATTCTCTCTCTCTTTATTCTTCTTCTTCAGCTTCAGATTTCAGATCTTAAATCTTC 

AAGTCnTCTTCTTCTTCTTCTGCAACC^TGGCTATGCAGGAACGTTGTGAGAGTT^ 

TCTGATGAACTTATATCTTCCTCAGATGCCTTTTACCTCAAGACAAGAAAGCCTTATACC 

ATCACTAAACAAAGAGAGAAATGGACAGAAGCAGAGCATGAGAAGTTTGTAGAAGCATTG 

AAACTCTATGGCAGAGCTTGGAGACGAATCGAAGAACATGTTGGAACAAAAACTGCAGTT 

CAGATTCGAAGCCATGCGCAGAAGTTCTTTACTAAGGTTGCTCGCGATTTTGGTGTTAGC 

TCTGAGTCCATTGAGATCCCGCCTCCAAGGCCAAAGAGAAAGCCGATGCATCCTTACCCT 

AGAAAGCTTGTGATTCCTGATGCAAAAGAGATGGTATACGCTGAACTAACCGGATCCAAG 

CTGATTCAGGATGAAGATAACCGATCTCCAACATCGGTTTTATCAGCTCATGGCTCAGAT 

GGATTAGGTTCCATTGGTTCAAATTCACCTAACTCTTCTTCAGCTGAGTTATCATCTCAC 

ACAGAGGAATCATTGTCTCTAGAAGCAGAGACCAAACAGAGCCTTAAGCTCTTTGGAAAA 

ACTTTTGTAGTTGGTGATTACAACTCTTCAATGAGTTGTGATGATTCTGAAGATGGCAAG 

AAGAAGCTATACTCAGAAACACAGTCTCTTCAATGTTCTTCTTCTACTTCAGAAAACGCT 

GAAACAGAAGTGGTAGTGTCGGAGTTCAAAAGAAGTGAGAGATCAGCTTTCTCTCAGTTA 

AAATCGTCGGTGACTGAGATGAACAACATGAGAGGGTTCATGCCTTACAAAAAGAGAGTA 

AAGGTGGAAGAAAACATTGACAATGTAAAATTATCATATCCTTTGTGGTGAAGTGTTCGT 

TTGTGTCAAGTCAGTTGTGTAAACTCTTTTGATCTCAACATCAGATTATGTGTATAATGT 

CAGAGTATTAGGGAAAGTTTTTTTGGATTAGATTCGTAAGATCACTCCAAAGTTTCGTGT 

CTTTCCATATAACCAGTTAGAAATTGAGATCCTTGTACTTAAACATTTTTATTTGATCAA 

TCAAATCTTCTTGATGAAAAAAAAAA 

>G1269 Amino Acid Sequence (domain in AA coordinates: 27-83) 

MAMQERCESLCSDELISSSDAFYLKTRKPYTITKQREKWTEAEHEKFVEALKIiYGRAWRR 

IEEHVGTKTAVQIRSHAQKFFTKVARDFGVSSESIEIPPPRPKRKPMHPYPRKLVIPDAK 

EMVYAELTGSKLIQDEDNRSPTSVLSAHGSDGLGSIGSNSPNSSSAELSSHTEESLSLEA 

ETKQSLKLFGKTFWGDYNSSMSCDDSEDGKKKLYSETQSLQCSSSTSENAETEVVVSEF 

KRSERSAFSQLKSSVTEMNNl^GFMPYKKRVKVEENIDNVK^ 

>G1452 (175..1296)

ATTTATTAAGCATCAATGAGAGAACTTCAGAGCTGGGTTTGAGTTCTGTCCAATAATACA 
TAACCACGTTATCATTTTTGTCCTTTACTATCTCATTACACTCTTCTGTTATTCGCCCAA 
TTCTTACAGTCATTACTCTCTATAGGGCTCGAGCGGCCGCCCGGGCAGGTTTCTATGCAG 
ATGGTTCACACTTCCCGCTCCATTGCCCAGATTGGGTTCGGTGTTAAGTCGCAATTAGTA 
CTCACTATAGGGCTCGAGCGGCCGCCCGGGCAGGTAAAAGATCAAACAATGTCTAAAGAA 
GCTGAGATGTCGATCGCGGTGTCGGCTTTGTTCCCTGGTTTTAGATTCTCTCCTACTGAT 
GTTGAACTTATCTCGTACTATCTTCGTCGTAAAATCGATGGTGATGAGAACTCTGTTGCT 
GTGATTGCTGAGGTCGAGATTTACAAGTTCGAGCCGTGGGACTTGCCAGAGGAATCGAAA 
CTGAAATCGGAGAACGAGTGGTTTTACTTCTGCGCGAGGGGGAGGAAGTACCCGCACGGG 
TCACAAAGCCGGCGAGCCACACAGCTAGGATATTGGAAAGCGACCGGTAAAGAGCGGAGT 
GTTAAATCCGGGAACCAAGTTGTTGGAACCAAGAGAACGCTTGTATTTCATATCGGTCGG 
GCTCCTCGTGGCGAGAGAACGGAGTGGATTATGCATGAATACTGCATCCATGGAGCCCCA 
CAGGATGCATTAGTGGTGTGCCGGTTAAGAAAAAATGCTGATTTTCGGGCTAGTTCGACC 
CAAAAAATTGAGGATGGTGTTGTGCAAGACGATGGCTACGTTGGCCAAAGAGGTGGTTTG 
GACAAGGAGGACAAATCCTACTATGAATCTGAGCATCAGATACCAAATGGTGACATCGCA 
GAATCATCAAATGTTGTTGAGGATCAGGCCGATACCGATGATGATTGTTACGCCGAGATT 
CTGAACGATGATATAATAAAGCTCGACGAAGAAGCGTTGAAAGCTAGCCAAGCGTTTCGA 
CCAACTAATCCAACTCATCAAGAAACAATATCAAGCGAGTCATCGAGTAAGAGGTCAAAA 
TGTGGTATAAAAAAAGAATCAACGGAAACAATGAATTGTTACGCTTTGTTCAGGATCAAG 
AACGTTGCCGGAACCGACTCCAGCTGGAGATTCCCGAACCCGTTCAAAATCAAGAAAGAT 






GATAGCCAGAGATTGATGAAGAATGTTCTGGCCACTACTGTTTTCTTGGCTATCTTATTT 

TCTTTCTTTTGGACTGTATTAATAGCTAGGAACTAAAGCTAGTTACGACATACATATTAT 

TTATACATAAATAAATATAGTATTTTGTCTATGGCAAAAAAAAAAAAAAAA 

>G1452 Amino Acid Sequence (domain in AA coordinates: 30-177) 

MQMVHTSRSIAQIGFGVKSQLVLTIGLERPPGQVKDQTMSKEAEMSIAVSALFPGFRFSP

TDVELISYYLRRKIDGDENSVAVIAEVEIYKFEPWDLPEESKLKSENEWFYFCARGRKYP

HGSQSRRATQLGYWKATGKERSVKSGNQVVGTKRTLVFHIGRAPRGERTEWIMHEYCIHG

APQDALVVCRLRKNADFRASSTQKIEDGVVQDDGYVGQRGGLDKEDKSYYESEHQIPNGD

IAESSNVVEDQADTDDDCYAEILNDDIIKLDEEALKASQAFRPTNPTHQETISSESSSKR 

SKCGIKKESTETMNCTALFRIKNVAGTO^ 

LFSFFWTVLIARN* 

>G1494 (114..1406)

TCGACAGAGTTGTGTTGGGCGTGGAACTTGGACTAGTTCCACATATCAGGTTATATAGAT 
CTTCTCTTTCAACTTCTGATTCGTCCAGAAGCTTTCCTAATCTGAGATCTGACATGGAAC 
ACCAAGGTTGGAGTTTTGAGGAGAATTATAGTTTGTCCACTAATAGAAGATCTATCAGGC 
CACAAGATGAACTAGTGGAGTTATTATGGCGAGATGGACAAGTGGTTCTGCAGAGCCAAA 
CTCATAGAGAACAAACCCAAACCCAGAAACAAGATCATCATGAAGAAGCCCTAAGATCCA 
GCACCTTTCTTGAAGATCAAGAAACTGTCTCTTGGATCCAATACCCTCCAGATGAAGACC 

CATTCGAACCCGACGACTTCTCCTCCC^C^ 

CAACCTCAGAGACGGTTAAGCCTAAGTCCAGTCCTGAACCTCCTCAAGTCATGGTTAAGC 

CTAAGGCCTGTCCTGACCCTCCTCCTCAAGTCATGCCTCCTCCAAAATTTAGGTTAACAA 

ATTCATCATCGGGGATTAGGGAAACAGAAATGGAACAGTACTCGGTAACGACCGTTGGAC 

CTAGCCATTGCGGAAGCAACCCATCACAGAACGATCTCGATGTCTCAATGAGTCATGATC 

GAAGCAAAAACATAGAAGAAAAGCTTAATCCGAACGCAAGTTCCTCATCAGGTGGCTCCT 

CTGGTTGC^GCTTTGGCAAAGATATCAAAGAAATGGCTAGTGGAAGATGCATCAC^ACCG 

ACCGTAAGAGAAAACGTATAAATCACACTGACGAATCTGTATCTCTATCAGATGCAATCG 

GTAACAAGTCGAACCAACGATCAGGATCAAACCGAAGGAGTCGAGCAGCTGAAGTTCATA 

ATCTCTCCGAAAGGAGGAGGAGAGATAGGATCAATGAGAGAATGAAGGCTTTGCAAGAAC 

TAATACCTCACTGCAGTAAAACTGATAAAGCTTCGATTTTAGACGAAGCCATAGATTATT 

TGAAATCACTTCAGTTACAGCTTCAAGTGATGTGGATGGGGAGTGGAATGGCGGCGGCGG 

CGGCTTCGGCTCCGATGATGTTCCCCGGAGTTCAACCTCAGCAGTTCATACGTCAGATAC 

AGAGCCCGGTACAGTTACCTCGATTTCCGGTTATGGATCAGTCTGCAATTCAGAACAATC 

CCGGTTTAGTTTGCCAAAACCCGGTACAAAACCAGATCATCTCCGACCGGTTTGCTAGAT 

ACATCGGTGGGTTCCCACACATGCAGGCCGCGACTCAGATGCAGCCGATGGAGATGTTGA 

GATTTAGTTCACCGGCGGGACAGCAAAGTCAACAACCGTCGTCTGTGCCGACGAAGACCA 

CCGACGGTTCTCGTTTGGACCACTAGGTTGGTGAGCCACTTTGC 

>G1494 Amino Acid Sequence (domain in AA coordinates: 261-311)

MEHQGWSFEElTySLSTNRRSIRPQDELVELLWRDGQVVLQSQTHREQTQTQKQDHHEEAL 

RSSTFLEDQETVSWIQYPPDEDPFEPDDFSSHFFSTOTPLQRPTSETVKPKSSPEPPQVM 

VKPKACPDPPPQVMPPPKFRLTNSSSGIRETEMEQYSVTTVGPSHCGSNPSQNDLDVSMS 

HDRSKNIEEKLNPNASSSSGGSSGCSFGKDIKEMASGRCITTDRKRKRIimTDESVSLSD 

AIGNKSNQRSGSNRRSRAAEVHNLSERRRRDRINERMKAL^ 

DYLKSLQLQLQVMWMGSGMAAAAASAPMMFPGVQPQQFIRQIQSPVQLPRFPVMDQSAIQ 
NNPGLVCQNPVQNQIISDRFARYIGGFPHMQAATQMQPMEMLRFSSPAGQQSQQPSSVPT
KTTDGSRLDH* 
>G1548 (1..2511) 

ATGGCAATGTCTTGCAAGGATGGTAAGTTGGGATGTTTGGATAATGGGAAGTATGTGAGG 
TATACACCTGAAC^GTTGAAGCACTTGAGAGGCTTTATCATGACTGTCCTAAACCGAGT 
TCTATTCGCCGTCAGCAGTTGATCAGAGAGTGTCCTATTCTCTCTAACATTGAGCCTAAA 
CAGATCAAAGTGTGGTTTCAGAACCGAAGATGTAGAGAGAAACAAAGGAAAGAGGCTTCA 
CGGCTTCAAGCTGTGAATCGGAAGTTGACGGCAATGAACAAGCTCTTGATGGAGGAGAAT 
GACAGGTTGCAGAAGCAAGTGTCACAGCTGGTCCATGAAAACAGCTACTTCCGTCAACAT 
ACTCCAAATCCTTCACTCCCAGCTAAAGACACAAGCTGTGAATCGGTGGTGACGAGTGGT 
CAGCACCAATTGGCATCTCAAAATCCTCAGAGAGATGCTAGTCCTGCAGGACTTTTGTCC 
ATTGCAGAAGAAACTTTAGCAGAGTTTCTTTCAAAGGCAACTGGAACCGCTGTTGAGTGG 
GTTCAGATGCCTGGAATGAAGCCTGGTCCGGATTCCATTGGAATCATCGCTATTTCTCAT 
GGTTGCACTGGTGTGGCAGCACGCGCCTGTGGCCTAGTGGGTCTTGAGCCTACAAGGGTT 
GCAGAGATTGTCAAGGATCGTCCTTCGTGGTTCCGCGAATGTCGAGCTGTTGAAGTTATG
AACGTGTTGCCAACTGCCAATGGTGGAACCGTTGAGCTGCTTTATATGCAGCTCTATGCA 
CCAACTACATTGGCCCCACGACGCGATTTCTGGCTGTTACGTTACACCTCTGTTTTAGAA 
GATGGCAGCCTTGTGGTGTGCGAGAGATCTCTTAAGAGCACTCAAAATGGTCCTAGTATG 
CCACTGGTTCAGAATTTTGTGAGAGCAGAGATGCTTTCCAGTGGGTACTTGATACGGCCT 
TGTGATGGTGGTGGCTCAATCATACACATAGTGGATCATATGGATTTGGAGGCTTGTAGC 
GTGCCTGAGGTCTTGCGCCCGCTCTATGAGTCACCCAAAGTACTTGCACAGAAGACAACA 
ATGGCGGCACTGCGTCAGCTCAAGCAAATAGCTCAGGAGGTTACTCAGACTAATAGTAGT 
GTTAATGGGTGGGGACGGCGTCCTGCTGCCTTAAGAGCTCTCAGCCAGAGGCTAAGCAGA 
GGCTTCAATGAAGCTGTAAATGGTTTCACTGATGAAGGATGGTCAGTGATAGGAGATAGC 
ATGGATGATGTCACAATCACTGTAAACTCTTCTCCAGAC^ 

ACATTTGCCAATGGCTTTGCTCCTGTAAGC^TGTTGTTTTATGCGCAAAAGCATCAATG 
CTTTTACAGAATGTTCCTCCGGCGATCCTGCTTCGGTTTCTGAGGGAGCATAGGTCAGAA 
TGGGCTGACAACAACATTGATGCGTATCTAGCAGCAGCAGTTAAAGTAGGGCCTTGTAGT 
GCCCGAGTTGGAGGATTTGGAGGGCAGGTTATACTTCCACTTGCTCATACTATTGAGCAT 
GAAGAGTTTATGGAAGTCATCAAATTGGAAGGTCTTGGTCATTCCCCTGAAGATGCAATC 
GTTCCAAGAGATATCTTCCTTCXTCAACTTTGTAGCGGAATGGATGAAAATGCTGTAGGA 
ACCTGTGCGGAACTTATATTTGCTCCAATCGATGCTTCGTTTGCGGATGATGCACCTCTG 
CTTCCTTCTGGTTTTCGTATTATCCCTCTTGATTCCGCAAAGGAAGTATCTAGCCCT^AAC 
CGAACCTTGGATCTTGCTTCGGCACTGGAAATTGGTTCAGCTGGAACAAAAGCCTCAACT 
GATCAATCAGGAAACTCCACATGTGCAAGATCTGTGATGACAATAGCATTTGAGTTTGGT 
ATCGAGAGCCATATGCAAGAACATGTAGCATCCATGGCTAGGCAGTATGTTCGAGGTATC 
ATATCATCGGTGCAGAGAGTAGCATTGGCTCTTTCTCCTTCTCATATCAGCTCACAAGTT 
GGTCTACGCACTCCTTTGGGTACTCCTGAAGCCCAAACACTTGCTCGTTGGATTTGCCAG 
AGTTACAGGGGCTACATGGGTGTTGAGCTACTTAAATCAAACAGTGACGGCAATGAATCT 
ATTCTTAAGAATCTTTGGCATCACACTGATGCTATAATCTGCTGCTCAATGAAGGCCTTG 

CCOGTCTraCACATTTGCAAACCA 

CTTCAAGACATCTCTTTAGAGAAGATATTTGATGACAATGGAAGAAAGACTCTTTGCTCT 

GAGTTCCCACAGATCATGCAACAGGGCTTCGCGTGCCTTCAAGGCGGGATATGTCTCTCA 

AGCATGGGGAGACCAGTTTCGTATGAGAGAGCAGTTGCTTGGAAAGTACTCAATGAAGAA 

GAAAATGCTCATTGCATCTGCTTTGTGTTCATCAATTGGTCCTTTGTGTGA 

>G1548 Amino Acid Sequence (domain in AA coordinates: 17-77) 

MAMSCKDGKLGCLDNGKYVRYTPEQVEALERJjYHDCPKPSSIRRQQIjIRECPILSNIEPK 

QIK\WFQNRRCREKQRKEASRLQAWRKLTAMNKLL^ 

TPNPSLPAKDTSCESVWSGQHQIiASQNPQRDASPAGLLSIAEETLAEFLSKATGTAVEW 
VQMPGMKPGPDSIGIIAISHGCTGVAARACGLVGLEPTRVAEIVKDRPSWFRECRAVEVM 
NVLPTANGGTVELLYMQLYAPTTIiAPPRDFWIjIjRYTSVLEDGSLVVCERSLKSTQNGPSM 
PLVQNFVRAEMLSSGYLIRPCDGGGSIIHIVDHMDLEACSVPEVLRPLYESPKVLAQKTT
MAALRQLKQIAQEVTQTNSSVNGWGRRPAALRALSQRLSRGFNEAVNGFTDEGWSVIGDS

MDDVTITVNSSPDKLMGLITLTFANGFAPVSNVVLCAKA^ 

WADNNIDAYLAAAVKVGPCSARVGGFGGQVILPLAHTIEHEEFMEVIKLEGLGHSPEDAI
VPRDIFLLQLCSG^ENAVGTC^LIFAPIDASFADDAPLLPSGFRIIPLDSAKEVSSPN 
RTLDLASALEIGSAGTKASTDQSGNSTCARSVMT 

ISSVQRVALALSPSHISSQVGLRTPLGTPEAQTLARWICQSYRGYMGVELLKSNSDGNES
ILKNLWHHTDAIICCSMKALPVFTFANQAGLDMLETTLVALQDISLEKIFDDNGRKTLCS
EFPQIMQQGFACLQGGICLSSMGRPVSYERAVAWKVLNEEENAHCICFVFINWSFV* 

>G1574 (1..1962) 

ATGGATGATACAAXGGACATGAGTTCAGGTAGTGATGAAGAAGTACAAGAAGAGAAGACC 
ACTX3TTAACGAGAGGGTCATCTATCAGGCTGCACT 

GAAAAGGATCTACCTCCTGGTGTTCTTACAGTTCCTCTTATGAGGCATCAGAAAATTGCA 
TTGAACTGGATGCGTAAGAAAGAAAAAAGAAGCAGGC71CTGTTTGGGAGGGATATTAGCA 
GATGATCAGGGACTTGGTAAAACGATCTCGACGATCTCTCTTATCCTGTTACAAAAGTTG 
AAGTCACAATCAAAGCAGAGAAAGCGAAAAGGTCAAAACTCTGGTGGTACATTGATTGTT 
TGTCC^GCAAGTGTTGTAAAACAATGGGC^GAGAAGTTAAAGAGAAGGTTTCTGATGAA 

C^CAAACTCTCTGTTTTAGTCCACCATGGA 

GCAATATATGATGTGGTCATGACAACTTACGCCATTGTTACAAATGAAGTTCCACAAAAC 
CCTATGCTGAATCGTTATGATAGTATGAGAGGCAGAGAAAGCCTTGACGGATCGAGTTTG 
ATTCAGCCTC^CGTTGGTGCACTAGGAAGAGTTAGGTGGTTGAGAGTAGTATTAGATGAA
GCTC^TACAATTAAAAACCATAGAACCCTAATTC 

AAAAGGAGATGGTGTTTGACTGGAACGCCGATAAAGAACAAAGTAGACGATCTTTATAGC 
TATTTCAGATTTCTTAGATATGATCCATATGCCA^ 

AAAGCTCCAATTGATAAAAAGCCTCTTC^TGGTTACAAGAAGCTTCAAGCTATTCTAAGG 

GGTATAATGTTGCGCCGCACCAAAGAATGGTCTTTCTACAGGAAGCTTGAATTGAATTCA 

CGTTGGAAGTTTGAGGAATATGCTGCTGATGGGACTTTGCATGAACACATGGCTTATCTT 

TTGGTGATGCTTTTGCGACTACGCC^GCTTGTAACCATCCACAACTTGTTAACGGATAT 

AGTCACTCAGATACTACAAGAAAAATGTCAGATGGAGTTCGAGTAGCCCCTAGAGAGAAT 

CTAATCATGTTCCTCGATCTCTTGAAATTATCCTC^CCACCTGCTCTGTTTGTAGTGAT 

CCACCAAAAGACCCTGTTGTTACTTTGTGTGGCCATGTGTTTTGTTATGAGTGTGTGTCT 

GTAAACATTAACGGGGATAACAATACGTGCCCTGCACTTAATTGCCACAGCCAGCTTAAA 

CATGATGTTGTTTTC^CTGAATCTGCAGTTAGAAGTTGC^TCAACGATTATGATGATCCT 

GAAGATAAAAATGCTTTAGXTGCATCAAGGCGAGTTTATTTCATCGAAAATCCGAGCTGT 

GATAGAGATTCTTCAGTCGCTTGCAGAGCAAGGCAGTCC7VGACACTCCACCAATAAAGAC 

AATAGTATCAGTGGACTGAATCTCATTTTTACGTT^ 

GAAACAGGTGCGATGTTGATGTCTCITAAA 

GC^GTCATGTCATTCTACTGGACCTATGGTGGAATCCAACAACAGAGGATC^ 

GATCGAGCTCATCGTATCGGACAAACTCGAGCTGTTACGGTCACTCGTATTGCCATCAAA 

AATACCGTTGAGGAACGAATTTTGACTCTTCATGAACGTAAAAGGAACATTGTTGCATCT 

GCATTGGGTGAAAAAAACTGGCAAAAGTTCTGCGATTCAACTAACACTAGAAGATC 

ATATCTGTTTTTTGGTGTGTAGAATATCCCAGAGTTTTTATTGATAAGAGGAATAAAACC 

TTTAGCTATTTAATAAGTCACAAGTGTGAATGTAATGAATAA 

>G1574 Amino Acid Sequence (domain in AA coordinates: 28-350) 

MDDTMDMSSGSDEEVQEEKTTVNERVIYQAALQDLKQPKTEKDLPPGVLWPLMRHQKIA 

LNWMRKKEKRSiraCLGGIItADDQGL^ 

CPASVVKQWAREVIOSKVSDEHKLSVLVHHGSHRT 

PMLiTRYDSMRGRESLDGSSLIQPHVGALGRWWLRVVI^ 

KRRWCLTGTPIKNKVDDLYSYFRFLRYHPYAMCNSFHQRIKAPIDKKPLHGYKKLQAILR

GIMLRRTKEWSFYRKLELNSRWKFEEYAADGT^^ 

SHSDTTRKMSDGWVAPRENLIMFLDLLKIiSST^ 

VNINGDNNTCPALNCHSQLKHDVVFTESAVRSCINDYDDPEDKNALVASRRVYFIENPSC
DRDSSVACRARQSRHSTNKDNSISGLIWjIFTFLKD 

ASHVILLDLWWNPTTEDQAIDRAHRIGQTRAVTVTRIAIKNTVEERILTLHERKRNIVAS
ALGEKNWQKFCDSTNTRRSRISVFWCVEYPRVFIDKRNKTFSYLISHKCECNE* 
>G1586 (1..807) 

ATGAATCAAGAAGGTGCTTCACATAGCCCATCCTCCACTTCCACCGAACCAGTCCGGGCA 

CGTTGGTCACCTAAACCGGAGCAAATCTTGATACTCGAATCCATCTTCAACAGTGGTACT 

GTTAACCCACCAAAAGATGAAACGGTGAGGATAAGAAAGATGCTTGAGAAATTCGGTGCT 

GTGGGAGACGCAAACGTCTTCTACTGGTTTCAAAACCGACGGTCAAGATCTCGCCGGAGA 

CACCGGCAGCTTTTAGCAGCCACCACCGCAGCCGCCACCTCCATAGGAGCTGAAGACCAC 

CAGCACATGACGGCCATGAGCATGCATCAATATCCTTGCAGCAACAAC 

GGGTTTGGAAGTTGTAGCAACTTATCAGCTAATTACTTCCTTAATGGATCGTCGTCATCT 

CAAATCCCTTCCTTTTTCCTCGGCCTCTCTTCTTCAAGTGGTGGGTGTGAGAACAACAAT 

GGTATGGAGAATCTCTTCAAAATGTATGGCCATGAATCTGATCATAATCATCAGCAGCAG 

CATCATAGCTGZUVATGCTGCATCAGTTTTAAAC^ 

TACGAACAAGAAGGGTTTATGATOGTGTTTATAAACGGAGTTCCTATGGAAGTAA(^AAA 
GGAGCAATAGACATGAAAACAATGTTCGGTGATGATTCGGTGTTACTTCATTCCTCTGGT 
CTTCCTCTTCCCACTGATGAGTTTGGTTTCTTGATGCATTCTTTACAACATGGACAAACT 
TATTTCCTGGTACCGAGACAGACATGA 

>G1586 Amino Acid Sequence (domain in AA coordinates: 21-81) 

MNQEGASHSPSSTSTEPVRARWSPKPEQILIIjESIFNSGTVNPPKDETVRIRKMLEKFGA 

VGDANVFYWFQNRRSRSRRRHRQLLAATTAAATSIGAEDHQHMTAMSMHQYPCSNNEIDL

GFGSCSNLSANYFLNGSSSSQIPSFFLGLSSSSGGCEl^GMENLFKiyr^GHESDHNHQQ 

HHSSNAASVLNPSDQNSNSQYEQEGFMTVFINGVPMETO 

LPLPTDEFGFLMHSLQHGQTYFLVPRQT* 

>G1786 (1..1170) 
ATGATCGTGTACGGTGGGGGAGCATCCGAGGACGGTGAAGGTGGAGGGGTGGTTCTGAAG

AAAGGGCCATGGACGGTGGCCGAGGACGAGACACTGGCGGCTTACGTACGGGAATACGGT 

GAAGGGAACTGGAATTCTGTTCAGAAGAAGACATGGCTGGCTAGGTGTGGCAAGAGCTGC 

CGCCTCCGCTGGGCTAACCACTTACGACCTAATCTCAGGAAAGGCTCCTTCACCCCCGAG 

GAAGAACGTCTCATCATACAAOTCCACTCTCAGCTAGGCAACAAATGGGCTCG 

GCTCAGTTACCAGGCAGAACAGATAACGAGATCAAGAACTACTGGAACACGAGGTTGAAA 

CGCTTCCAACGCCAAGGCCTCCCTCTCTACCCTCCAGAATATTCCCAAAACAATCATCAA 

C^CAAATGTATCCTC^CAGCCCTCCTCACCTCTCCCGTCCCAAACACCTGCTTCTTCC 

TTTACCTTTCCTCTCCTCCAACCGCCTTCTCTGTGTCCCAAACGTTGTTATAACACTGCC 

TTCTCTCCCAAGGCCTCATATATTTCTTCT^ 

TTTCTTCAC^CCCATTCCTCTCTTTCCTCCTATCAGTCTACCAATCCGGTTTACTCCA^ 
AAACATGAGCTCTCTTCAAACCAAATTCCATACTCTC 

AGCAAGTTCTCAGACAATGGGGATTGTAACCAAAACCTGAACACCGGTTTGCATACAAAT 
ACCTGTCAGCTGTTAGAGGATCTTATGGAGGAGGCCGAGGCTCTAGCTGATAGCTTTCGT 
GCTCCTAAGCGGAGACAAATCATGGCTGCGCTTGAGGACAACAACAACAACAACAACTTT 

TTCTCGGGAGGTTTCGGACATCGTGTTTCI^C 

ACACCAAAGGAAGATGAGTCTCTCCAGATGAACACAATGCAAGATGAGGACATAACAAAG 
CTTCTTGACTGGGGAAGTGAAAGTGAAGAAATCTCAAACGGGCAATCCTCTGTGATAACA 
ACAGAGAACAACCTTGTCCTTGACGATCACC^GTTCGCTTTTCTGTTTCCAGTTGATGAT 
GACACCAACAACTTGCCAGGGATCTGCTAG 

>G1786 Amino Acid Sequence (domain in AA coordinates: TBD) 

MIVYGGGASEDGEGGGVVLKKGPWTVAEDETI^ 

RLRWANHLRPNLRKGSFTPEEERLIIQLHSQLGNKWARM^ 

RFQRQGLPLYPPEYSQNNHQQQMYPQQPSSPLPSQTPASSFTFPLLQPPSLCPKRCYNTA 
FSPKASYISSPTNFLVSSPTFLHTHSSLSSYQSTNPVYSMKHELSSNQIPYSASLGVYQV 
SKFSDNGDCNQNLNTGLHTNTCQLLEDLMEEAEALADSFRAPKRRQIMAALE 
FSGGFGHRVSSNSLCSLQGLTPKEDESLQMNTMQDEDITKLLDWGSESEEISNGQSSVIT 

TENNLVLDDHQFAFLFPVDDDTNNLPGIC*
>G1792 (77..496)

AATCC^TAGATCTCTTATTAAATAACAGTGCTGACC^^GCTCTTAC^AAGCAAACCAATC 

TAGAACACCAAAGTTAATGGAGAGCTCTU^CAGGAGCAGCAACAACCAATCA 

CAAGCT^GCTCGTTTCCGGGGAGTTCGAAGAAGGCCTTGGGGAAAGTTTGCAGCAGAGAT 

TCGAGACCCGTCGAGAAACGGTGCCCGTCTTTGGCTCGGGACATTTGAGACCGCTGAGGA 

GGCAGCAAGGGCTTATGACCGAGCAGCCTTTAACCTTAGGGGTCATCTCGCTATACTCAA 

CTTCCCTAATGAGTATTATCCACGTATGGACGACTACTCGCTTCGCCCTCCTTATGCTTC 

TTCTTCTTCGTCGTCGTC^TCGGGTTCAACTTCTACTAATGTGAGTCGACAAAACCAAAG 

AGAAGTTTTCGAGTTTGAGTATTTGGACGATAAGGTTCTTGAAGAACTTCTTGATTCAGA 

AGAAAGGAAGAGATAATC^CGATTAGTTTTGTTTTGATATTTTATGTGGCACTGTT^ 

CTACCTACGTGCATTATGTGCATGTATAGGTCGCTTGATTAGTACTTTATAACATGCATG 

CCACGACCATAAATTGTAAGAGAAGACGTACTTTGCGTTTTCATGAAATATGAATGTTAG 

ATGGTTTGAGTACAAAAAAAAAAAAAAAAAAAAAAA 

>G1792 Amino Acid Sequence (domain in AA coordinates: 17-85)
MESSNRSSNNQSQDDKQARFRGVRRRPWGKFAAEIRDPSRNGARLWLGTFETAEEAARAY 
DRAAFNLRGHLAILNFPNEYYPRMDDYSLRPPYASSSSSSSSGSTSTNVSRQNQREVFEF
EYLDDKVLEELLDSEERKR* 
>G1865 (48..899)

AAGAAGAGGACATGAAGCACAGAGATTCTGCAGACTGCAGGTGACCAATGGACACITTA^ 
CAATAAAAACATAC^TACTACTCTCTTACACTTTC 

TTAATCTCTCTTTCTTCTTCATCTCTCTTTCTCTTTCTCTCTTCATGGCTACAAGGATTC 
CATTCACAGAATCACAATGGGAAGAACTTGAAAACCAAGCTCTTGTGTTCAAGTACTTAG 
CTGCAAATATGCCTGTTCCACCTCATCTTCTCTTCCTCATCAAAAGACCCTTTCTCTTCT 
CTTCTTCTTCTTCTTC^TCTTCTTCTTC^GCTTCTTCTCrCCC^CTCTTTCTCCACACT 
TTGGGTGGAATGTGTATGAGATGGGAATGGGAAGAAAGATAGATGCAGAGCCAGGAAGAT 
GTAGAAGAACTGATGGCAAGAAATGGAGATGCTCTAAAGAAGCTTACCCTGACTCTAAGT 
ACTGTGAGAGACATATGCATAGAGGCAAGAACCGTTCTTCCTCAAGAAAGCCTCCTCCTA 
CTCAATTCACTCCAAATCTCTTTCTCGACTCTTCTTCCAGAAGAAGAAGAAGTGGATAC^ 
TGGATGATTTCTTCTCCATAGAACCTTCCGGGTCAATCAAAAGCTGCTCTGGCTCAGCAA 
TGGAAGATAATGATGATGGCTCATGTAGAGGCATCAACAACGAGGAGAAGCAGCCGGATC
GACATTGCTTCATCCTTGGTAOTGACTTGAGGACACGTGAGAGGCCATTGATGTTAGATC 
AGAAGCTGAAACAAAGAGATCATGATAATGAAG7VAGAGCAAGGAAGCAAGAGGTTTTATA 
GGTTTCTTGATGAATGGCCTTCTTCTAAATCT^ 

CATCTTTTGTTCTTATAACCTTGTATTTCTTGTTAAGATGGTAATGCAAATT 

>G1865 Amino Acid Sequence (domain in AA coordinates: 124-149) 

MDTLSIKTYLLLSYTFNFPIQIPIPNLSFFFISLSLSLFMATRIPFTESQWEELENQALV 

FKYLAANMPVPPHLLFLIKRPFLFSSSSSSSSSSSFFSPTLSPHFGWNVYEMGMGRKIDA 

EPGRCRRTDGKKWRCSKEAYPDSKYCERHMHRGKNRSSSRKPPPTQFTPNL 

RSGYMDDFFSIEPSGSIKSCSGSAMEDNDDGSCRGINNEEKQPDRHCFILGTDLRTRERP 

LMLEEKLKQRDHDNEEEQGSKRFYRFLDEWPSSKSSVSTSLFI* 

>G1886 (43..909)

AGGAAACATAAGTAATCGTTGCTTCGATCCTTTGTTACATGGATGGATCCTGAACAGGAA 
ATCTCAAACGAGACTTTGGAAACTATATTGGTAAGTTCAACAAAAGGAAGCAATAATAAC 
AATAAGAAAATGGAAGAAGAAATGAAGAAGAAAGTATCAAGAGGAGAATTAGGAGGTGAA 
GCTCAAAATTGTCCAAGATGTGAATCTCCAAACACAAAGTT^ 

AGTCTCTCACAACCTCGTTACTTCTGCAAATCTTGTCGGAGATATTGGACTAAAGGCGGT 
ACTCTTCGTAACGTTCCCGTCGGTGGTGGTTGCCGTCGAAACAAACGATCCTCTTCCTCA 
GCTTTCTCCAAGAACAACAACAATAAGTCTAT^ 

CCTTTAATTACGGGAATGCCACCATCATCTTTTGGTTATGATCACTCCATTGATCTCAAC 
CTCGCTTTCGCTACTCTCCAAAAGCATCATTTATCCTCTCAAGCTACTACGCCTTCTTTT 
GGGTTTGGAGGTGATCTTTCTATTTATGGAAACTCAACGAATGATGTAGGGATCTTCGGA 
GGGCAAAACGGTACTTATAACAATAGTTTGTGTTATGGGTTTATGTCCGGAAATGGTAAT 
AATAATCAAAATGAAATCAAGATGGCTTCTACATTGGGGATGTCTTTGGAAGGAAACGAG 
AGAAAGCAAGAGAATGTGAACAATAACAATAATAACTCAGAGAATCCTAGCAAGGTGTTC 
TGGGGGTTTCCATGGCAGATGACCGGAGATTCCGCCGGAGTTGTACCGGAGATTGATCCC 
GGAAGGGAAAGCTGGAATGGGATGGTTTCATCTTGGAATAATGGTTTACTCAACACTCCT 
TTGGTCTAGCAGATCATTAA 

>G1886 Amino Acid Sequence (domain in AA coordinates: 17-59)

MDPEQEISNETLETILVSSTKGSNNNNK^ 

CTYNireSLSQPRYFCKSCRRYWTKGGTLRNVPVGG 

TDPLQNPLITGMPPSSFGYDHSIDLNLAFATLQKHHLSSQATTPSFGFGGDLSIYGNSTN
DVGIFGGQNGTYNNSLCYGFMSGNGNNNQNEIKMASTLGMSLEGNERKQENVNNNNNNSE
NPSKVFWGFPWQMTGDSAGVVPEIDPGRESWNGMVSSWNNGLLNTPLV*
>G1933 (33.. 1418) 

AATTGAGATTAAAGTAATTTATCTTTCAGAAAATGGCGGTTGAAGACGATGTATCTTTGA 
TAAGAACGACGACGTTAGTGGCACCAACAAGACCCACGATTACAGTTCCTCATAGACCTC 
CGGCGATCGAAACGGCGGCGTATTTCTTTGGCGGTGGAGATGGGCTTAGTCTAAGCCCAG 
GGCCACTTTCTTTTGTCTCTTCTTTGTTTGTTGATAACTTCCCTGACGTCTTGACGCCGG 
ATAACCAACGGACGACGTCGTTTACTCAGCTTCTTAACGGAACTATGTCGGTGTCTCCTG 
GTGGCGGAGGACGTTCAACGGCGGGGATGTTCGCCGGAGGAGGTCCGATGTTTACAATCC 
CTTCTGGTTTCAGCCCTTCTAGTCTTCTCACCTCGCCCATGTTCTTTCCCCCGCAGTCGT 
CAGCTCATACCGGCTTTATTCAACCACGGCAGCAGTCACAACCGCAACCACAACGACGAG 
ACACGTTTCCTCACCATATGCCACCATCGACATCCGTCGCCGTCCATGGTCGTCAATCTT 
TAGACGTTTCACAAGTAGATCAAAGAGCTCGAAACCATTATAATAATCCGGGGAATAACA 
ATAATAACCGGTCGTATAACGTTGTGAACGTTGATAAACCGGCGGATGACGGTTATAACT 
GGAGGAAGTACGGACAAAAGCCTATCAAAGGGTGTGAATATCCAAGGAGTTATTACAAAT 
GTACACATGTTAACTCTCCGGTGAAGAAGAAAGTCGAACGGTCATCGGATGGACAGATCA 
CTCAGATCATTTACAAAGGTCAACATGATCACGAGAGGCCTCAGAATCGCCGTGGCGGTG 
GAGGCAGAGATTCCAGTGAGGTTGGTGGTGCAGGGCAAATGATGGAATCTAGTGATGATA 
GTGGTTATCGTAAGGATCATGATGATGATGATGATGATGATGAAGATGATGAAGATCTTC 
CGGCTTCAAAGATAAGAAGAATAGACGGTGTGTCGACGACTCACCGGACGGTGACCGAGC 
CTAAGATTATCGTTCAGACAAAAAGTGAAGTCGATCTTCTCGACGATGGCTATAGGTGGC 
GTAAGTACGGACAAAAAGTTGTCAAAGGAAATCCCCATCCAAGGAGCTATTATAAATGTA 
CAACGCCAAATTGTACGGTCCGTAAACATGTAGAGAGAGCTTCCACGGATGCTAAGGCTG 
TGATTACAACTTACGAAGGTAAACACAATCACGATGTCCCTGCCGCTAGAAACGGTACCG 
CGGCAGCAACCGCAGCTGCGGTGGGGCCGTCTGACCACCATCGTATGAGATCAATGTCGG 
GGAACAATATGCAACAACATATGAGTTTCGGTAACAATAATAACACAGGCCAATCTCCGG
TTCTTTTGAGGTTGAAAGAAGAGAAAATCACAATTTGACTTTTAAGAACCAAAGATTTCG

AGATTGATATT 

>G1933 Amino Acid Sequence (conserved domain in AA coordinates: 205-263,
MAVEDDVSLIRTTTLVAPTRPTITVPHRPPAIETAAYFFGGGDGLSLSPGPLSFVSSLFV
DNFPDVLTPDNQRTTSFTQLLNGTMSVSPGGGGRSTAGMFAGGGPMFTIPSGFSPSSLLT
SPMFFPPQSSAHTGFIQPRQQSQPQPQRPDTFPHHMPPSTSVAVHGRQSLDVSQVDQRAR 

NHYNNPGN1INNNRSYNWNV1DKPADIX3YNTO 

VERSSDGQITQIIYKGQHDHERPQNRRGGGGRDSTEVGGAGQMMESSDDSGYRKDHDDDD 
DDDEDDEDLPASKIRRIDGVSTTHRTVTEPKIIVQTKSEVDLLDDGYRWRKYGQCTVKGN
PHPRSYYKCTTPNCTTOKHV13RASTDAKAVITT 
DHHRMRSMSGNNMQQHMSFGNNNNTGQSPVLLRLKEEKITI*
>G2059 (58.. 1089) 

TTAAGAACAGGCTTCATTCTCTGGACAAACACTCAAAAAACAAACAAAAAAA 

GAAGATGAGTTTCCTAAAATAGAAACTAGCTTCATGCACGACAAGCTCTTGTCT^ 

ATCTACGGGTTCTTGAGTTCTTCGACGCCGCCACAACTTCTCGGTGTTCCAATATTTTTG 

GAAGGTATGAAATCTCCTCTTCTTCCTGCTTCCTCGACTCCGAGCTACTTTGTGTCGCCT

CATGATCATGAGCTCACATCTTCTATTCATCC^TCTCCGGTAGCTTCTGTTCCTTGGAAC 

TTTCTAGAATCTTTTCCTCAGTCTC3^ 

CTTACTTTGTTCCTTAAAGAACCAAAGCTACTAGAACTTTCTCAATCCGAAAGC^CATG 

AGCCCTTACCATAAATACATCCCAAACTCCTTTTATCAATCAGACCAAAACAGAAACGAA

TGGGTAGAGATCAATAAAACTCTAACCAACTATCCCTCGAAAGGTTTTGGAAACTATTGG 

CTAAGTACCACCAAGACTCAACCCATGAAGTCAAAAACAAGAAAGGTTGTTCAGACGACG 

ACCCCAACAAAACTGTATAGAGGAGTGAGACAAAGACACTGGGGCAAATGGGTCGCAGAG 

ATTAGGCTTC(^GGAAC^GAACCCGTGTTTGGCTCGGCACTTTTGAAACCGCTGAGC^ 

GCAGCAATGGCTTACGATACAGCAGCTTATATCCTTCGTGGCGAATTCGCACACCTCAAC 

TTTCCTGATCTTAAACACCAGCTCAAGTCCGGTTCTTTGCGATGCATGATCGCCTCACTT 

TTGGAGTCCAAGATTCAACAGATCTCATCTTCCCAAGTAAGTAACTCTCCTTCTCCTCCT 

CCTCCAAAAGTGGGAACACCGGAGCAAAAGAATCATCACATGAAGATGGAGTCAGGAGAA 

GACGTGATGATGAAGAAACAGAAAAGCCATAAGGAAGTGATGGAAGGAGATGGTGTACAA 

TTGAGTAGGATGCCTTCTTTGGATATGGATCTCATTTGGGATGCTCTCTCATTTCCTCAT 

TCTTCTTGACTTCAAATTAATATTTGT 

TATCAAAAGTTTCCACCAAAGAAAGAA 

TGGGGTTGAACACATTGTAATTCTTCTTACGACCACATAATCAAGTGGTTCTCCTTTTTT 
TGTCTGCTAA 

>G2059 Amino Acid Sequence (conserved domain in AA coordinates : 184-254) 
MEDQFPKIETSFMHDKLLSSGIYGFLSSSTPPQLLGVPIFLEGMKSPLLPASSTPSYFVS 
PHDHELTSSIHPSPVASVPWNFLESFPQSQHPDHHPSKPPNLTLFLKEPKLLELSQSESN 
MSPYHKYIPNSFYQSDQNRNEWVEINKTLTNYPSK^

TTPTKLYRGWQRHWGKWVAEIRLPRJ^TRVTOiGTFETAEQAAMAYDTAAYILRGEF^ 
NFPDLKHQLKSGSLRCMIASLLESKIQQISSSQVSNSPSPPPPKVGTPEQKNHHMKMESG 
EDVMMKKQKSHKEVMEGDGVQLSRMPSLDMDLIWDALSFPHSS*
>G2105 (42.. 1487) 

CTCTCTGACTTGAACTCTTCTCTTCTACCGAATCAAACCAAATGGAGGATCATCAAAACC 
ATCCACAGTACGGTATAGAACAACCATCTTCTCAATTCTCCTCTGATCTCTTCGGCTTCA 
ACCTCGTTTCAGCGCCGGACCAGCACCATCGTCTTCATTTCACCGACCATGAGATAAGTT 
TATTGCCACGTGGAATACAAGGGCTTACGGTGGCTGGAAACAACAGTAACACTATTACAA 
CGATCCAGAGTGGTGGCTGTGTTGGTGGGTTTAGTGGCTTTACGGACGGCGGAGGAACAG 
GGAGGTGGCCGAGGCAAGAGACGTTGATGTTGTTGGAGGTCAGATCTCGTCTTGATCACA 
AGTTCAAAGAAGCTAATCAAAAGGGTCCTCTCTGGGATGAAGTTTCTAGGATTATGTCGG 
AGGAACATGGATACACTAGGAGTGGCAAGAAGTGTAGAGAGAAGTTCGAGAATCTCTACA 
AGTACTATAAAAAAACAAAAGAAGGCAAATCCGGTCGGCGACAAGATGGTAAAAACTATA 
GATTTTTCCGGCAGCTTGAAGCGATATACGGCGAATCCAAAGACTCGGTTTCTTGCTATA 
ACT^CACGC^GTTCATAATGACCAATGCTCTTCATAGTAATTTCCGCGCTTCTAACATTC 
ATAACATCGTCCCTCATCATCAGAATCCCTTGATGACCAATACCAATACTCAAAGTCAAA 
GCCTTAGCATTTCTAACAATTTCAACTCCTCCTCCGATTTGGATCTAACTTCTTCCTCTG 
AAGGAAACGAAACTACTAAAAGAGAGGGGATGCATTGGAAGGAAAAGATCAAGGAATTCA 
TTGGTGTTCATATGGAGAGGTTGtt

AGATTGTGGAAGACAAAGAACATCAAAGGATGCTGAGAGAAGAGGAATGGAGAAGGATTG 
AAGCGGAAAGGATCGATAAGGAACGTTCGTTTTGGACAAAAGAGAGGGAGAGGATTGAAG 
CTCGGGATGTTGCGGTGATTAATGCCTTGCAGTACTTGACGGGAAGGGCATTGATAAGGC 
CGGATTCTTCGTCTCCTACAGAGAGGATTAATGGGAATGGAAGCGATAAAATGATGGCTG 
ATAATGAATTTGCTGATGAAGGAAATAAGGGCAAGATGGATAAAAAACAAATGAATAAGA 
AAAGGAAGGAGAAATGGTCAAGCCACGGAGGGAATCATCCAAGAACCAAAGAGAATATGA 
TGATATACAACAATCAAGAAACTAAGATTAATGATTTTTGTCGAGATGATGACCAATGCC 
ATCATGAAGGTTACTCACCTTCAAACTCCAAGAACGCAGGAACTCCGAGCTGCAGCAATG 
CCATGGCAGCTAGTACAAAGTGCTTTCCATTGCTTGAAGGAGAAGGAGATCAGAACTTGT 
GGGAGGGTTATGGTTTGAAGCAAAGGAAAGAAAATAATCATCAGTAAGCTACATTTTTCA 
TTCTCAAAATGAAGAATAAGAGAACTTAGAAACGAT 

>G2105 Amino Acid Sequence (domain in AA coordinates: 100-153) 
MEDHQNHPQYGIEQPSSQFSSDLFGFNLVSAPDQHHRLHFTDHEISLLPRGIQGLTVAGN

NSNTITTIQSGGCVGGFSGFTDGGGTGRWPRQETL^^ 

VSRIMSEEHGYTRSGKKCREKFENLYKYYKKTKEGKSGRRQDGKNYRFF 

DSVSCYNNTQFIMTNALHSNFRASNIHNIVPHHQNPLMTNTNTQSQSL

DLTSSSEGNETTKREGMHWKEKIKEFIGVHMERLIEKQDFWLEKLMKIVEDKEHQRMLRE

EEWRRIEAERIDKERSFWTKERERIEARDVAVINALQYLTGRALIRPDSSSPTERINGNG

SDKMMADNEFADEGNKGKMKKQMNKl^ 

RDDDQCHHEGYSPSNSKNAGTPSCSNAMAASTKCFPLLEGEGDQNLWEGYGLKQRKENNH
Q* 

>G2117 (49.. 465) 

ATACTTGTCAACAAAAATTTTCTTAAAGAACGCATAACTGTTTTTTTCATGGCTGGTTCT 
GTCTATAACCTTCCAAGTCAAAACCCTAATCCACAGTCTTTATTCCAAATCTTTGTTGAT 
CGAGTACCACTTTCAAACTTGCCTGCCACGTCAGACGACTCTAGCCGGACTGCAGAAGAT
AATGAGAGGAAGCGGAGAAGGAAGGTATCGAACCGCGAGTCAGCTCGGAGATCGCGTATG 
CGGAAACAGCGTCACATGGAAGAACTGTGGTCCATGCTTGTTCAACTCATCAATAAGAAC 
AAATCTCTAGTCGATGAGCTAAGCCAAGCCAGGGAATGTTACGAGAAGGTTATAGAAGAG 
AACATGAAACTTCGAGAGGAAAACTCCAAGTCGAGGAAGATGATTGGTGAGATCGGGCTT 
AATAGGTTTCTTAGCGTAGAGGCCGATCAGATCTGGACCTTCTAATCGTCTCGTAAGCTT 

GTTGGTTTTTTGTTGTTTATTTAAAG 

>G2117 Amino Acid Sequence (conserved domain in AA coordinates: 46-106)

MAGSVYNLPSQNPOTQSLFQIFVI)RVTLSNLP 

RSRMRKQRHMEELWSMLVQLINKNKSLTO^ 

EIGLNRFLSVEADQIWTF* 

>G2124 (87.. 923) 

GAACAGCAAAACCCTAGATTTCCTGTTCAAGCTCAAGACCGTACAAAACTTTGGAACTCA 
TATATAAAGATCTCGAGAATAGCATTATGAATATCGTCTCTTGGAAAGATGCAAACGACG 
AAGTTGCAGGCGGCGCTACGACAAGACGTGAAAGAGAAGTAAAAGAGGATCAAGAAGAAA 
CCGAAGTCAGAGCCACCAGTGGCAAAACCGTAATTAAAAAGCAGCCTACATCGATCTCTT 
CTTCTTCTTCTTCGTGGATGAAATCCAAGGATCCGAGGATTGTTAGGGTTTCACGCGCCT 
TTGGAGGCAAAGACCGTCACAGCAAAGTGTGTACGTTACGTGGACTACGTGACAGACGCG 
TGAGATTATCAGTCCCAACGGCTATTCAGCTCTACGATCTTCAAGAACGGCTCGGTGTTG 
ACCAGCCTAGCAAAGCCGTTGACTGGTTGCTTGATGCAGCTAAAGAGGAGATCGACGAGC 
TACCTCCGTTACCTATCTCGCCGGAAAATTTCAGCATCTTCAACCATCATCAGTCCTTCT 
TGAATCTTGGTCAACGGCCCGGTCAAGATCCGACCCAACTCGGGTTTAAAATCAATGGAT 
GTGTACAAAAGTCTACTACTACTAGCCGCGAAGAAAACGATAGAGAGAAAGGAGAAAACG 
ATGTCGTTTACACAAACAATCATCATGTTGGGTCTTATGGAACTTATCACAACCTGGAAC 
ATCATCT^TCATCATCACCAACATTTGAGTTTACAGGCAGATTATCATAGTCATCAACTAC 
ATAGTCTTGTCCCATTTCGATCACAAATTTTGGTATGTCCAATGACGACATCACCAACAA 
CTACAACTATACAATCTTTGTTTCCATCATCATCGTCAGCTGGTTCAGGGACTATGGAGA 
CATTAGATCCGAGGCAAATGTAGCAACAATGGTGGTAGAGACATTGATAATCGGATGTCG 
TCGGTCCAATTCAACCGAACTAATAGCACTACAACGGCTAACATGTCGAGGCATCTAGGC 
TCGGAGCGTTGTACAAGTAGAGGAAGTGATCACCATATGTGAAGTTAGATTATTGAAACG 
ATATAATTGTTGTTTGATGTGTTCAGAAATAAGGGGACAC 

>G2124 Amino Acid Sequence (domain in AA coordinates: 75-132) 
MNIVSWKDANDEVAGGATTRREREVKEDQEETEVRATSGKTVIKKQPTSISSSSSSW

KDPRIVRVSRAFGGKDRHSKVCTLRGLRDRRVRLSVPTAIQLYDLQERLGVDQPSKAVDW

LLDAAKEEIDELPPLPISPENFSIFNHHQSFLNLGQRPGQDPTQLGFKINGCVQKSTTTS 

REEITOREKGENDVVYTNNHHVGSYGTYI^ 

ILVCPMTTSPTTTTIQSLFPSSSSAGSGTMETLDPRQM* 

>G2140 (148..1254)

ACTCTCTTAACTTTCGTTTCTTCTCCTACCTTCTTTTACCAACCTTTCCTTTCTCTTACA 
CACATATATATATACATATATAGAGAGAGAGAAGAGGACAAAGAGTTGAAAGATGAAGAC 
TCTCATGTCTTCATAGAAACAAGTGATATGTGCGCTAAGAAAGAAGAAGAAGAAGAAGAA 
GAAGAAGACAGTTCTGAAGCCATGAACAACATACAAAATTACCAAAATGACCTCTTCTTT 
CACCAACTCATCTCTCATCATCACCATCATCATCATGATCCOT 

GGAGCATCCGGTAACGTTGGATCTGGTTTCACTATCTTCTCTCAAGATTCCGTCTCTCCA 

ATATGGTCTCTACCTCCACCTACCTCGATCCAACCACCATTTGATCAGTTTCCTCCTCCT 

TCTTCTTCTCCAGCATCTTTCTACGGAAGTTTCTTCAACAGAAGTCGAGCT 

GGATTACAGTTTGGGTACGAGGGTTTTGGTGGAGCCACGTCAGCAGCACATCATCATCAT

GAACAACTTCGGATCTTGTCGGAAGCTTTAGGTCCGGTAGTACAAGCCGGGTCCGGTCCT 

TTTGGGTTACAAGCTGAGTTAGGGAAGATGACAGCACAAGAGATC^ 

TTGGCTGCTTCAAAGAGTCATAGTGAAGCTGAGAGAAGAAGAAGAGAGAGAATCAATAAT

CATCTCGCTAAGCTCCGTAGCATATTACCCAACACCACaU^CGGATAAAGCGTCGTTA 

CTAGCTGAAGTGATCCAACATGTGAAAGAGTTGAAGAGAGAGACTTCAGTGATCTCAGAG 

ACAAATCTTGTCCCAACGGAAAGCGATGAGTTAACGGTAGCTTTCACGGAGGAGGAAGAA 

ACCGGAGATGGCAGATTTGTAATTAAAGCGTCGCTTTGCTGTGAAGACAGGTCGGATCTC 

TTGCCTGACATGATTAAAACATTGAAAGCTATGCGTCTCAAAACGCTCAAGGCGGAGATA 

AC(^CCGTTGGGGGACGAGTCAAGAACGTTTTGTTTGTTACCGGAGAAGAGAGCTCCGGT 

GAGGAAGTGGAGGAAGAGTACTGTATAGGGACGATTGAGGAAGCTTTGAAAGCGGTGATG 

GAGAAGAGCAATGTAGAGGAATCATCTTCTTCTGGAAATGCTAAGAGACAGAGAATGAGT 

AGTCACAACACTATCACTATCGTCGAACAACAACAACAATATAATCAGAGGTAATCAATT 

TTTTACTTAAATCGCTTTTTTTTCTTACTTTCGGTGTATCTACTACGTGTGTTGTTTGCT 

GGTTATGGAAATGAATGTTGTACGTCACGTTATACTATAGATATATGTGTGTTTGTGTGT 

ATGTATAACGGAAGTATTTGTATCCGTTGTGGTCTTGGACTTTTGGTTTGGTTCTAAGAT 

ACTTATTTTTAAAAACTTGTATCGTTGAGTTGGTTTTCTAGATATGCTTAATGGGAGTAT 

GTGACGAAAAAAAA 

>G2140 Amino Acid Sequence (domain in AA coordinates : 167-242) 

MCAKKEEEEEEEEDSSEAMNNIQNYQNDLFFHQLISHHHHHHHDPSQSETLGASGNVGSG 

FTIFSQDSVSPIWSLPPPTSIQPPFDQFPPPSSSPASFYGSFFNRSRAHHQGLQFGYEGF 

GGATSAAHHHHEQLRILSEALGPVVQAGSGPFGLQAELGKMTAQEIMDAKALAASKSHSE

AERRRRERIMNHLAKLRSILPNTTKTDKASLLAEVIQHVKELKM 

ELTVAFTEEEETGDGRFVIKASLCCEDRSDLLPDMIKTLKAMRLKTLKAEITTVGGRVKN

VLFVTGEESSGEEVEEEYCIGTIEEALKAVMEKSNVEESSSSGNAKRQRMSSHNTITIVE 

QQQQYNQR* 

>G2144 (102.. 1241) 

ATTAGGGTTTTGTTGTCGTGAGATTTGATTAC^CAAATTGCTGAATTTGGTTTCGATTAT 
TGGTGTTATTGTTTTCGAAGATTTCCAGTGAGTTTCCGTTTATGGATCTGACTGGAGGAT 
TTGGAGCTAGATCCGGCGGTGTTGGACCGTGCCGGGAACCAATAGGCCTTGAATCGCTAC 
ATCTCGGTGACGAATTTCGGCAACTAGTGACGACTTTACCTCCCGAGAACCCCGGCGGTT 
CGTTCACGGCTTTGCTTGAGCTTCCACCTACACAAGCAGTGGAGCTTCTCCATTTCACTG 
ATTCTTCGTCTTCTCAACAAGCGGCAGTGACAGGGATCGGTGGAGAGATTCCTCCGCCGC 
TTCACTCTTTCGGTGGGACATTGGCTTTTCCTTCTAACTCAGTTCTCATGGAGCGAGCAG 
CTCGTTTCTCGGTGATTGCCACTGAGCAACAAAACGGAAATATCTCCGGGGAGACTCCGA 
CGAGCTCTGTACCTTGCAATTCAAGTGCTAATCTCGACAGAGTCAAGACGGAGCCTGCTG 
AGACCGATTCATCTCAGCGGTTGATTTCTGATTCAGCGATTGAGAATCAAATCCCTTGCC 
CTAACCAGAACAATCGAAATGGGAAGAGGAAAGATTTCGAAAAGAAGGGTAAAAGCTCGA 
CGAAGAAGAACAAAAGCTCTGAAGAGAACGAGAAGCTGCCATATGTTCACGTTAGAGCTC 
GTCGTGGTCAAGCAACCGATAGCCATAGCTTAGCAGAACGAGCAAGAAGAGAGAAGATAA 
ATGCACGAATGAAGCTGTTACAGGAACTGGTCCCAGGCTGTGATAAGATTCAAGGTACCG 
CGCTGGTGCTGGATGAAATCATTAACCATGTCCAGTCATTACAACGTCAAGTGGAGATGC
TATCAATGAGACTTGCTGCGGTAAACCCCAGAATCGACTTCAATCTCGACACCATATTGG 
CITCAGAAAACGGTTCTTTAATGGATC^

GGCCTCAGCAAGCCATTGAGACCGAACAGTCCTTTCATCACCGGCAACTGCAACAACCAC 

CAACACAACAATGGCCTTTTGACGGCTTGAACCAGCCGGTATGGGGAAGAGAAGAGGATC 

AAGCTCATGGCAATGATAACAGCAATTTGATGGGAGTTTCTGAAAATGTAATGGTGGCTT 

CTCCTAATTTGCACCCAAATCAGGTCAAAATGGAGCTGTAAGTTGGGAAAACG 

TCATGAATGTGTATATACATCGTATAAGCTCX3TTTCTCrCTATATAAATATAATCATAAA 

TATAGATATCTGTTAAGAAGGTATCAGTCATTTGATTCAGAGAGACAACACTGGTATGAT 

TGTTTCTTATTCTTGTACCAGATTTCGACAATGTAGAATTTAGTAGGATATGATCATTTT 

GATCTCGTTATATATA 

>G2144 Amino Acid Sequence (domain in AA coordinates : 203-283) 

MDLTGGFGARSGGVGPCREPIGLESLHLGDEFRQLVTTLPPENPGGSFTALLELPPTQAV 

ELLHFTDSSSSQQAAVTGIGGEIPPPLHSFGGTLAFPSNSVLMERAARFSVIATEQQNGN 

ISGETPTSSVPSNSSAl^DRVICTEPAETDSSQRLIS 

KKGKSSTKKNKSSEENEKLPYVHVRARRGQATDSHSLAERAR^ 

DKIQGTALVLDEIINHVQSLQRQVEMLSMRLAAWPRIDFNLDTII^ 

APMQIAWPQQAIETEQSFHHRQLQQPPTQQWPFDGLNQPVWGREEDQAHGNDNSm^ 

ENVMVASANLHPNQVKMEL*
>G2431 (47.. 1057) 

CCCTTTCGTTTTTATTTAAATTTCTTGGGTCGTTTCTTAAATTTGTATGTGTTTATTAAT 
GGAGATCAAC&ATAATGCCAACAATACTAATACT^ 

GAGCCTTGTGTTGTCAACGGATGCTAAGCCAAGGTTGAAATGGACTTGTGATCTTCATCA 
CAAATTCATCGAAGCCGTTAATCAACTTGGAGGACCTAACAAAGCAACACCTAAGGGTTT
GATGAAGGTTATGGAGATTCCTGGGCTTACCTTATACC^TCTCAAGAGCCATTTACAGAA 
ATATCGGTTAGGGAAGAGCATGAAGTTCGATGATAAC^GCTAGAAGTTTCCTCTGCATC 
AGAGAATCAAGAAGTTGAGAGTAAAAACGATTCAAGAGATCTCCGAGGCTGCAGTGTCAC 
CGAAGAAAACAGCAATCCAGCTAAAGAAGGGCTACAAA 

GATGGAAGTTCAGAAGAAACTTCATGAACAAATCGAAGTTCAGAGGCATTTGCAGGTGAA 
GATTGAGGCACAAGGAAAGTATCTACAGTCCGTTTTAATGAAAGCTCAACAAACTCTCGC 
TGGCTACTCATCTTCAAATCTCGGCATGGATTTTGCGAGGACCGAGCTCTCTAGATTAGC 
TTCAATGGTGAACAGAGGCTGTCCAAGCACTTCGTTCTCAGAGCTAACGCAAGTAGAAGA 
AGAAGAAGAAGGTTTCTTGTGGTACAAGAAACCAGAAAACAGAGGAATTAGTCAGCTGAG 
ATGTTCAGTAGAGAGCTCGTTGACATCTTCAGAGACCTCAGAGACAAAACTGGATACTGA 
CAATAACCTTAATAAATCGATTGAACTTCCGTTGATGGAGATCAACTCGGAAGTGATGAA 
GGGGAAGAAGAGAAGCATAAACGACGTCGTTTGCGTGGAGCAGCCTCTAATGAAGAGAGC 
TTTTGGAGTTGATGATGATGAGCATTTGAAGTTGAGTTTGAATACTTACAAGAAAGACAT 
GGAGGCGTGTACGAACATAGGACTAGGGTTTAATTAAAAAAAAAACATTTTACTAAAGTT
ATATAAAAATGTTTTAAAAGAATCCA 

>G2431 Amino Acid Sequence (conserved domain in AA coordinates : 38-88) 
MCLLMEINNNANNTNTTIDNHKAKMSLVLSTDAKPRLKWTCDLHHKFIEAVNQLGGPNKA
TPKGLMKVMEIPGLTLYHLKSHLQKYRLGKSMKFDDNKLEVSSASENQEVESKNDSRDLR
GCSVTEENSNPAKEGLQITEALQMQMEVQKKLHEQIEVQRHLQVKIEAQGKYLQSV^ 
QQTLAGYSSSNLGMDFARTELSRLASMVira^ 

ISQLRCSVESSLTSSETSETKLDTDNNLNKSIELPLMEINSEVMKGKKRSINDVVCVEQP

LMKRAFGVDDDEHLKLSLNTYKKDMEACTNIGLGFN* 

>G2465 (86.. 1150) 

C^TATTCTTTCTCCATTGAGATTAAGCTTCTTTCTCGCTGTCGTCTCTCT 

GGTTCTTAGTCCCTTTTGAATAATAATGATGGTGGAGATGGATTACGCTAAGAAAATGCA 

GAAATGTCATGAATACGTTGAAGCACTTGAAGAAGAACAGAAGAAAATCCAAGTCTTTCA 

ACGCGAGCTTCCTTTATGTTTAGAGCTTGTCACTCAAGCGATCGAAGCTTGTCGGAAGGA 

GTTATCTGGTACGACGACAACTACATCAGAACAGTGTTCAGAACAGACCACAAGTGTTTG 

TGGTGGTCCTGTCTTTGAAGAGTTTATTCCTATCAAGAAAATTAGTTCCTTGTGTGAAGA 

AGTACAAGAAGAAGAAGAAGAAGATGGTGAACATGAATCTTCTCCAGAACTTGTGAATAA 

TAAGAAATCAGATTGGGTTAGATCTGTTCAGCTATGGAATCATTCACCGGATCTAAATCC 

AAAAGAGGAGCGTGTAGCTAAGAAAGCGAAAGTGGTGGAGGTGAAACCAAAAAGCGGTGC 

GTTTCAGCCGTTTCAAAAGCGCGTTTTGGAGACTGATTTGCAACCGGCGGTGAAAGTAGC 

TAGTTCGATGCCAGCGACGACGACGAGTTCTACGACGGAAACTTGTGGTGGTAAAAGTGA 

TTTGATTAAAGCTGGAGATGAGGAAAGACGGATAGAGCAGCAGCAATCGCAGTCGCATAC 
GCATAGAAAACAAAGGCGGTGCTGGTCGCCGGAATTACACCGTCGATTCCTAAACGCGCT
TCAGCAGCTTGGAGGATCTCATGTTGCTACACCAAAGCAAATCAGGGATCACATGAAGGT 
TGATGGATTAACAAACGACGAAGTTAAAAGCCATTTACAGAAATATAGACTTCACACAAG 
AAGGCCAGCAGCAACATCCGTGGCGGCACAAAGTACCGGGAATCAGCAACAACCACAATT 
TGTGGTGGTTGGAGGCATATGGGTACCATCGTCACAAGATTTTCCACCACCGTCCGATGT 
AGCCAAC^GGGTGGTGTATATGCTCCGGTTGCGGTGGCGCAATCTCCAAAACGTTCGTT 
GGAGAGAAGTTGCAACTCGCCGGCGGCATCTTCCTCTACAAATACAAATACTTCTACTCC 
TGTGTCATAATCTGATAGTCATACTATAATCATCTCCTGATGTTGATTTTGGTGTAGGTT 
TGAAAATGTTTATGTGAATGTAA 

>G2465 Amino Acid Sequence (conserved domain in AA coordinates : 219-269) 
MMVTS^YAKKMQKCHE 

SEQCSEQTTSVCGGPVFEEFIPIKKISSLCEEVQEEEEEDGEHESSPELVNNKKSDWLRS
VQLWNHSPDLNPKEBRVAKKAKvVEVKPKSGAFQPFQKRVXiETDLQPA 
SSTTETCGGKSDLIKAGDEERRIEQQQSQSHTHRKQRRCWSPELHRRFLNALQQLGGSHV 
ATPKQIRDHMKVDGLTNDEVKSHLQKYRLHTRRPAATSVAAQSTGNQQQPQFVVVGGIWV
PSSQDFPPPSDVANKGGVYAPVAVAQSPKRSLERSCNSP^SSSTNTNTSTPVS* 

>G2583 (38.. 607) 

CAAATCAGAAAATATAGAGTTTGAAGGAAACTAAAAGATGGTACATTCGAGGAAGTTCCG 

AGGTGTCCGCCAGCGACAATGGGGTTCTTGGGTCTCTGAGATTCGCCATCCTCTATTGAA 

GAGAAGAGTGTGGCTTGGAACTTTCGAAACGGCAGAAGCGGCTGCAAGAGCATACGACCA 

AGCGGCTCTTCTAATGAACGGCCAAAACGCTAAGACCAATTTCCCTGTCGTAAAATCAGA 

GGAAGGCTCCGATCACGTTAAAGATGTTAACTCTCCGTTGATGTCACCAAAGTCATTATC 

TGAGCTTTTGAACGCTAAGCTAAGGAAGAGCTGCAAAGACCTAACGCCTTCTTTGACGTG 

TCTCCGTCTTGATACTGACAGTTCCCACATTGGAGTTTGGCAGAAACGGGCCGGGTCGAA 

AACAAGTCCGACTTGGGTCATGCGCCTCGAACTTGGGAACGTAGTCAACGAAAGTGCGGT 

TGACTTAGGGTTGACTACGATGAACAAACAAAACGTTGAGAAAGAAGAAGAAGAAGAAGA 

AGCTATTATTAGTGATGAGGATCAGTTAGCTATGGAGATGATCGAGGAGTTGCTGAATTG 

GAGTTGACTTTTGACTTTAACTTGTTGCAAGTCCACAAGGGGTAAGGGTTTTC 

>G2583 Amino Acid Sequence (domain in AA coordinates: 4-71)

MvTISRKFRGWQRQWGSVTOSEIRHPLLKRRVWLGTFET^^ 

NFPVVKSEEGSDHVKDVNSPLMSPKSLSELLNAKLRKSCKDLTPSLTCLRLDTDSSHIGV
WQKRAGSKTSPTWVMRLELGNVWESAVT)LGL^ I SDEDQLAME 

MIEELLNWS* 
>G2724 (1..651) 

ATGGAAATAGAAATAAGGAGAGGTCCATGGACTGTGGAAGAAGACATGAAGCTCGTCAGT 
TACATTTCTCTTCACGGTGAAGGAAGATGGAACTCCCTCTCTCGTTCTGCTGGACTGAAT 
AGAACGGGGAAAAGTTGCAGATTGCGGTGGCTAAATTATCTCCGGCCGGATATCCGCCGT 
GGAGACATATCCCTTCAAGAACAATTTATCATCCTTGAACTCCATTCTCGTTGGGGAAAT 
CGGTGGTCAAAGATTGCTCAACATTTACCGGGAAGAACAGATAACGAGATAAAGAATTAT 
TGGAGAAGACGTGTTCAAAAGCATGCAAAACTTCTAAAATGTGACGTGAACAGCAAGCAA 
TTCAAAGACACCATCAAACATCTCTGGATGCCTCGTCTCATCGAGAGAATCGCCGCCACT 
CAAAGTGTCCAATTTACCTCTAACCACTACTCGCCTGAGAACTCCAGCGTCGCCACCGCC 
ACGTCATCAACGTCGTCGTCTGAGGCTGTGAGATCGAGTTTCTACGGTGGTGATCAGGTG 
GAATTTGGAACGTTGGATCATATGAC^^TGGTGGTTATTGGTTCAACGGCGGAGATACG 
TTTGAAACTTTGTGTAGTTTTGACGAGCTCAACAAGTGGCTCATACAGTAG 

>G2724 Amino Acid Sequence (conserved domain in AA coordinates : 7-113) 

MEIEIRRGPWTVEEDMKLVSYISLHGEGRWNSLSRSAGLNRTGKSCRLRWLNYLRPDIRR

GDISLQEQFIILELHSRWGNRWSKIAQHLPGRTDNEIKNYWRTRVQKHAKLLKCDVNSKQ

FKDTIKHLWMPRLIERIAATQSVQFTSNHYSPENSSVATATSSTSSSEAVRSSFYGGDQV 

EFGTLDHMTNGGYWFNGGDTFETLCSFDELNKWLIQ* 

>G377 (1..396)

atgggtctctcgcattttccaacagcgtcagaaggagtactaccacttctggtgatgaac 
acggttgtttcaatcactctgttgaagaacatggtgaggtctgtttttcaaattgttgca 
tccgagactgaatcttccatggagatagacgacgagcctgaagatgattttgttactaga 
agaatctcgataacacagttcaagtctctatgtgagaacatagaagaggaagaagaagag 
aaaggtgtggagtgttgtgtgtgcctttgtgggtttaaagaggaagaggaagtgagtgag 
ttggtttcttgcaagcatttcttccacagagcttgtctagacaactggtttggtaataac 
cacaccacatgccctctttgcaggtccattctctag

>G377 Amino Acid Sequence (domain in AA coordinates : 85-128) 

MGLSHFPTASEGVLPLLVMNTVVSITLLKNMVRSVFQIVASETESSMEIDDEPEDDFVTR

RISITQFKSLCENIEEEEEEKGVECCVCLCGFKEEEEVSELVSCKHFFHRACLDNWFGNN

HTTCPLCRSIL*
>G428 (97.. 1032) 

TTACTTTTGTGTTTCTTCATATTCTTCAGAAGCAAGCACAAGGCTAGGGATCGAAGAAGC 

GGCGATCACTGATCGTATCTCACTACGATCACATTAATGGATAGAATGTGTGGTTTCCGC 

TCGACGGAAGACTATTCGGAGAAAGCGACGTTGATGATGCCGTCCGATTATCAGTCTTTG 

ATTTGTTCAACCACCGGAGACAATCAAAGACTGTTTGGATCCGACGAACTCGCTACCGCT 

TTGTCCTCGGAGTTGCTTCCGCGTATTCGAAAAGCTGAGGATAATTTCTCTCTTAGTGTC 

ATCAAATCCAAAATCGCTTCTCATCCTTTGTATCCTCGCTTACTCCAAACCTACATCGAT 

TGCCAAAAGGTGGGAGCGCCTATGGAAATAGCGTGTATATTGGAAGAGATTCAGCGAGAG 

AACCATGTGTACAAGAGAGATGTTGCTCCATTATCTTGCTTTGGAGCTGATCCTGAGCTT 

GATGAATTCATGGAAACCTACTGTGATATATTGGTTAAATACAAAACCGATCTTGCGAGG 

CCGTTCGACGAGGCTACAACTTTCATAAACAAGATTGAAATGCAGCTTCAGAACTTGTGC

ACTGGTCCAGCGTCTGCTACAGCTCTTTCAGATGATGGTGCGGTTTCATCTGACGAGGAA 

CTGAGAGAAGATGATGACATAGCAGCGGATGACAGCCAACAAAGAAGCAATGACCGCGAT 

CTGAAGGACCAGCTACTACGCAAATTTGGTAGCCATATCAGTTCATTGAAACTCGAGTTC 

TCTAAAAAGAAGAAGAAAGGGAAGCTACCAAGAGAAGCAAGACAAGCGTTGCTCGATTGG 

TGGAATGTTCATAATAAATGGCCTTACCCTACTGAAGGCGACAAAATAGCTCTGGCTGAA 

GAAACAGGTTTGGATCAAAAACAAATCAACAATTGGTTTATAAACCAAAGGAAACGCCAT 

TGGAAGCCTTCGGAGAACATGCCGTTTGATATGATGGACGATTCTAATGAAACATTCTTT 

ACCGAGGAATGAAAAGAGAGACATGGGATTGTGCATTGTATAATTTTTAGACTGTTTTCC 

CAAGAAAAGAAAACAGTAAAAAGCTTTTGGTAAATGGGACATCATCGCGAATGAATGGAA 

CCAGTTAGCCAAAACGGTCAAGGGCGTGGCGTAACGAGACATTGTATTGGAAATAGTGGC 

AATATTATGTCACTAATCTTCCAATGGTCCAAAATGATAGATTTCTTATTTGTATirGAAC 

CTTACTTAGATAGCTGATGTGTCAACTAAATAATTTATTTTCATCCTTATACTACTTGTA 

TCAATGTCTCTAATTGATCAATTGTTGCTTGCTATTCAAAAAAAAA^ 

>G428 Amino Acid Sequence (domain in AA coordinates: 229-292) 

^RMCGFRSTEDYSEKATI^PSDYQSLICSTTGDNQRLFGSDELATALSSELLPRIRKA 

EDNFSLSVIKSKIASHPLYPRLLQTYIDCQKVGAPMEIACILEEIQRENHVYKRDVAPLS 

CFGADPELDEFMETYCDILVKYKTDLARPFDEATTFINKIEMQLQNLCTGPASATALSDD

GAVSSDEELREDDDIAADDSQQRSNDRDLKDQLLRKFGSHISSLKLEFSKKKKKGKLPRE

ARQALLDWWNVHNKWPYPTEGDKIALAEETGLDQKQINNWFINQRKRHWKPSENM 

DDSNETFFTEE* 
>G447 (241.. 3501) 

CTTTTTAAGAGCTTAAAAATTTGCTTTGAAGCTTCAAATATTCTTATGAACTAAAAAGAA 



ACTATTTAGTTTCTCTCGTGCTCTTCTCTTGAGCAAATACAGATTCGTTAATTT^ 

AGAAGAAGAACTCTGTTTCTTCCCTGCACCAAACCAATTTTTTCGTTCTTTCTATAAACC 

ATGAAAGCTCCATCAAATGGATTTCTTCCAAGTTCCAACGAAGGAGAGAAGAAGCCAATC 

AATTCTCAACTATGGCACGCTTGTGCAGGGCCTTTAGTTTCATTACCTCCTGTGGGAAGT 

CTTGTGGTTTACTTCCCTCAAGGACACAGCGAGCAAGTTGCAGCATCGATGCAGAAGCAA 

ACAGATTTTATACCAAATTACCCAAATCTTCCTTCTAAGCTGATTTGCTTGCTTCACAGT 

GTTACATTACATGCTGATACCGAAACAGATGAAGTCTATGCACAAATGACTCTTCAACCT 

GTGAATAAGTATGATAGAGAAGCATTGCTAGCTTCTGATATGGGCTTGAAGCTAAACAGA 

CAACCTACTGAGTT^TTTTGCAAGACTCTTACTGCAAGTGACACAAGCACTCAT 

TTCTCTGTACCGCGTCGTGCAGCTGAGAAAATATTCCCTCCTCTTGATTTCTCGATGCAA 

CCGCCTGCGCAAGAGATTGTAGCTAAAGATTTACATGATACTACATGGACTTTCAGACAT 

ATCTATCGAGGCCAACCAAAAAGACACTTGCTTACCACAGGTTGGAGCGTTTTTGTTAGC 

ACAAAGAGACTATTTGCGGGTGATTCAGTTTTGTTTGTAAGAGATGAGAAATCACAGCTG 

ATGTTGGGTATAAGACGTGCAAATAGACAAACTCCGACTCTTTCCTCATCGGTCATATCC 

AGCGACAGTATGCACATTGGGATACTTGCAGCTGCAGCTCATGCTAATGCCAATAGTAGC 

CCTTTTACCATCTTCTTCAATCCAAGGGCAAGTCCTTCAGAGTTTGTAGTTCCTTTAGCC 

AAATACAACAAAGCCTTATACGCTCAAGTATCTCTAGGAATGAGATTCCGGATGATGTTT 

GAGACTGAGGATTGTGGGGTTCGTAGATATATGGGTACAGTCACAGGTATTAGTGATCTT 
GACCCTGTAAGATGGAAAGGCTCACAATGGCGTAATCTTCAGGTAGGATGGGATGAATCA

ACAGCTGGAGATAGGCCAAGCCGAGTATCCATATGGGAAATCGAACCCGTCATAACTCCT 

TTTTACATATGTCCTCCTCCATTTTTCAGACCTAAGTACCCGAGGCAACCCGGGATGCCA 

GATGATGAGTTAGACATGGAAAATGCTTTCAAAAGAGCAATGCCTTGGATGGGAGAAGAC 

TTTGGGATGAAGGACGCACAGAGTTCGATGTTCCCTGGTTTAAGTCTAGTTCAATGGATG 

AGTATGCAGCAAAACAATCCATTGTCAGGTTCTGCTACTCCTCAGCTCCCGTCCGCGCTC 

TCATCTTTTAACCTACCAAACAATTTTGCTTCCAACGACCCTTCCAAGCTGTTGAACTTC 

CAATCCCCAAACCTCTCTTCCGCAAATTCCCAATTCAACAAACCGAACACGGTTAACCAT 

ATCAGCCAACAGATGCAAGCACAACCAGCCATGGTGAAATCTCAACAACAACAACAACAA 

CAACAACAACAACACCAACACCAACAACAACAACTGCAAC^^CAACT^CAACTACAGATO 

TCACAGCAACAGGTGCAGCAACAAGGGATTTATAACAATGGTACGATTGCTGTTGCTAAC 

CAAGTCTCTTGTCAAAGTCCAAACCAACCTACTGGATTCTCTCAGTCTCAGCTTCAGCAG 

CAGTCAATGCTCCCTACTGGTGCTAAAATGACACACCAGAACATAAATTCTATGGGGAAT 

AAAGGCTTGTCTCAAATGACATCGTTTGCGCAAGAAATGCAGTTTCAGCAGCAACTGGAA

ATGCATAACAGTAGCCAGTTATTAAGAAACCAGCAAGAACAGTCCTCTCTCCATTCATTA

CAACAAAATCTGTCCCAAAATCCTCAGCAACTCCAAATGCAACAACAATCATCAAAAC^

AGTCCTTCACAACAGCTTCAGTTGCAGCTACTGCAGAAGCTA

CAGTCGATTCCTCCAGTAAGCTCATCCTTACAGCCACAATTATCAGCGTTGCAGCAGACA 

CAAAGCCATCAATTGCAACT^CTTCTGTCGTCTCAAAATCAACAGCCCTTGGCACATGGT 

AATAACAGCTTCCCAGCTTCAACTTTCATGCAGCCTCCACAGATTCAGGTGAGTCCTCAG 

CAGCAAGGACAGATGAGTAACAAAAATCTTGTAGCCGCTGGAAGATCACATTCTGGCCAC 

ACAGATGGAGAAGCTCCTTCTTGTTCAACCTCACCTTCCGCCAATAACACGGGACATGAT 

AATGTTTCACCGACAAATTTCCTGAGCAGAAATCAACAGCAAGGACAAGCTGCATCTGTA 

TCTGCATCTGATTCAGTCTTTGAGCGCGCAAGCAATCCGGTCCAAGAGCTTTATACAAAA 

ACTGAGAGCCGGATCAGTCAAGGCATGATGAATATGAAGAGTGCTGGTGAACATTTCAGA 

TTTAAAAGCGCGGTAACAGATCAAATCGATGTATCCACAGCGGGAACGACGTACTGTCCT 

GATGTTGTTGGCCCTGTACAGCAGCAACAAACTTTCCCACTACCATCATTTGGTTTTGAT 

GGAGACTGCCAATCTCATCATCGAAGAAACAACTTAGCTTTCCCTGGTAATCTCGAAGCC 

GTAACTTCTGATCCACTCTATTCTCAAAAGGACTTTCAAAACTTGGTTCCCAACTATGGC 

AACACACCAAGAGACATTGAGACGGAGCTGTCCAGTGCTGCAATCAGTTCTCAGTCATTT 

GGTATTCCCAGCATTCCCTTTAAGCCCGGATGTTCAAATGAGGTTGGCGGCATCAATGAT 

TCAGGAATCATGAATGGTGGAGGACTGTGGCCCAATCAGACTCAACGAATGCGAACATAT 

ACAAAGGTTCAAAAACGAGGGTCAGTAGGTAGATCAATAGATGTTACCCGTTATAGCGGC 

TATGATGAACTTAGGCATGACTTAGCGAGAATGTTTGGCATCGAAGGACAGCTCGAAGAT 

CCGCTAACCTCTGATTGGAAACTCGTCTACACCGATCACGAAAACGATATTTTACTAGTT 

GGTGATGATCCTTGGGAAGAGTTTGTGAACTGCGTGCAGAACATAAAGATACTATCATCA 

GTAGAAGTTCAGCAAATGAGCTTAGACGGAGATCTTGCAGCTATCCCAACCACAAACCAA 

GCCTGCAGCGAAA(^GACAGCGGAAATGCTTGGAAAGTACACTATGAAGACACTTCTGCT 

GCAGCTTCTTTCAAC^GATAGAAATAAAAAGATGC 

TCATTCGAGGCCATCGCAAAGTAC^TGTTTTTTTTTGTGTGTATGTACTGCAAACAACAA 
ACTGAGAAGAAGAAGATACTGCACGGTATATAAACATTTTTATAGGACAGTGATTTGATT 
TTTC^TTCTAACTTGATGTTGTTGTACTTTCTTGTTTCCATATTTGTATAACAAGTATAA 
TGCTTGACAAGTCTATGAGGAGCATATCTTATACAGAGATACTAAGATGTAATGTTAATG 
TAACTAAACAATTACCTTCATTAATCATGAATCCTTTGGTCGTTTAAAA 

>G447 Amino Acid Sequence (conserved domain in AA coordinates : 22-356) 
MKAPSNGFLPSSNEGEKKPINSQLWHACAGPLVSLPPVGSLVVYFPQGHSEQVAASMQKQ

TDFIPNYPl^PSKLICLLHSVTLHADTETDEV^^ 

QPTEFFCKTLTASDTSTHGGPSVPRRAAEKIFPPLDFSMQPPAQEIVAKDLHDTTWTFRH 

IYRGQPKRHLLTTGWSVFVSTKRXjFAGDSVLFVRDEKS^ 

SDSMHIGILAAAAHANANSSPFTIFFNPRASPSEFVVPIiAKYNK^ 

ETEDCGVRRYMGTVTGISDLDPVRWKGSQWRNLQVGWDESTAGDRPSRVSIWEIEPVITP

FYICPPPFFRPKYPRQPGMPDDELDl^NAFKRAMPWMGEDFGMKDAQSSMFPGLSLVQWM 

SMQQI^PLSGSATPQLPSAIjSSFNLPNNFASNDPSKIjLNFQSPNLSSANSQFNKPNTVIJH 

ISQQMQAQPAMVKSQQQQQQQQQQHQHQQQQLQQQQQLQMSQQQVQQQGIYNNGTIAVAN

QVSCQSPNQPTGFSQSQLQQQSMLPTGAKMTHQNINSMGNKGLSQMTSFAQEMQFQQQLE 

MHNSSQLLRNQQEQSSLHSLQQNLSQNPQQLQMQQQSSKPSPSQQLQLQLLQKLQQQQQQ 

QSIPPVSSSLQPQLSALQQTQSHQLQQLLSSQNQQPLAHGNNSFPASTFMQPPQIQVSPQ 
QQGQMSNKNLVAAGRSHSGHTDGEAPSCSTSPSANNTGHDNVSPTNFLSRNQQQGQAASV
SASDSVFERASNPVQELYTKTESRISQGMMNMKSAGEHFRFKSAVTDQIDVSTAGTTYCP 
DVVGPVQQQQTFPLPSFGFDGDCQSHHPRNNLAFPGNLEAVTSDPLYSQKDFQNLVPNYG
NTPRDIETELSSAAISSQSFGIPSIPFKPGCSNEVGGINDSGIMNGGGLWPNQTQRMRTY
TKVQKRGSVGRSIDVTRYSGYDELRHDLARMFGIEGQLEDPLTSDWKLVYTDHENDILLV
GDDPWEEFVNCVQNIKILSSVEVQQMSLDGDLAAIPTTNQACSETDSGNAWKVHYEDTSA
AASFNR* 

>G464 (41.. 760) 

CTCTGCTGGTATCATTGGAGTCTAGGGTTTTGTTATTGACATGCGTGGTGTGTCAGAATT 
GGAGGTGGGGAAGAGTAATCTTCCGGCGGAGAGTGAGCTGGAATTGGGATTAGGGCTCAG 
CCTCGGTGGTGGCGCGTGGAAAGAGCGTGGGAGGATTCTTACTGCTAAGGATTTTCCTTC 
CGTTGGGTCTAAACGCTCTGCTGAATCTTCCTCTCACCAAGGAGCTTCTCCTCCTCGTTC 
AAGTCAAGTGGTAGGATGGCCACCAATTGGGTTACACAGGATGAACAGTTTGGTTAATAA 
CCAAGCTATGAAGGCAGCAAGAGCGGAAGAAGGAGACGGGGAGAAGAAAGTTGTGAAGAA
TGATGAGCTCAAAGATGTGTC^TGAAGGTGAATCCGAAAGTTCAGGGCrTAGGGTTTGT 
TT^AGGTGAATATGGATGGAGTTGGTATAGGCAGAAAAGTGGATATGAGAGCTCATTCGTC 
TTACGAAAACTTGGCTCAGACGCTTGAGGAAATGTTCTTTGGAATGACAGGTACTACTTG 
TCGAGAAAAGGTTAAACCTTTAAGGCTTTTAGATGGATCATCAGACTTTGTACTCACTTA 
TGAAGATAAGGAAGGGGATTGGATGCTTGTTGGAGATGTTCCATGGAGAATGTTTATCAA 
CTCGGTGAAAAGGCTTCGGATCATGGGAACCTCAGAAGCTAGTGGACTAGCTCCAAGACG 
TC^GAGCAGAAGGATAGACAAAGAAACAACCCTGTTTAGCTTCCCTTCCAAAGCTGGCA 
TTGTTTATGTATTGTTTGAGGTTTGCAATTTAOTCGATACTTTTTGAAGAAAGTATTTTG 
GAGAATATGGATAAAAGCATGCAGAAGCTTAGATATGATTTGAATCCGGTTTTCGGATAT 
GGTTTTGCTTAGGTCATTCAATTCGTAGTTTTCCAGTTTGTTTCTTCTTTGGCTGTGTAC 
CAATTATCTATGTTCTGTGAGAGAAAGCTCTT 

>G464 Amino Acid Sequence (domain in AA coordinates: 20-28, 71-82, 126-142, 187-224)

MRGVSELEVGKSNLPAESELELGLGLSLGGGAWKERGRILTAKDFPSVGSKRSAESSSHQ

GASPPRSSQWGWPPIGLHRMNSLVNNQAMKAARAEEGDGEKKVVK^ 

VQGLGFVKVNMDGVGIGRKVDMRAHSSYENIA^ 

SDFVLTYEDKEGDWMLVGDVPWRMFINSVKRLRIMGTSEASGLAPRRQEQKDRQRNNPV*
>G557 (192.. 698) 

CAGAGATCTGACGGCGGTAGCAGAGTAATCTATTCCTTCCCAAAATGTCTCGCAATTAGA 
TTCTTTCCAAGTTCTTCTGTAAATCCCAAGTCCCGCTCTTTTCCTCTTTATCCTTTTCAC 
CAGCTTCGCTACTAAGACAACAAATCTTTCCCTCTCTCTCTCGCCTGATCGATCTTCAAA 
GAGTAAGAAAAATGCAGGAACAAGCGACTAGCTCTTTAGCTGCAAGCTCTTTACCATCAA 
GCAGCGAGAGGTCATCAAGCTCTGCTCCACATTTGGAGATCAAAGAAGGAATTGAAAGCG 
ATGAGGAGATACGGCGAGTGCCGGAGTTTGGAGGAGAAGCTGTCGGAAAAGAAACTTCCG 
GTAGAGAATCTGGATCGGCGACCGGTCAGGAGCGGACACAGGCGACTGTCGGAGAAAGTC 
AAAGGAAGCGAGGGAGGACACCGGCGGAGAAAGAGAACAAGCGGCTGAAGAGGTTGTTGA
GGAACAGAGTTTCAGCTCAGCAAGCAAGAGAGAGGAAAAAGGCTTACTTGAGCGAGTTGG 
AAAACAGAGTGAAAGACTTGGAGAACAAAAACTCTGAACTTGAAGAGCGACTCTCTACTC 
TTCAGAACGAGAACCAGATGCTTAGACATATTCTGAAGAACACAACAGGAAACAAGAGAG 
GAGGTGGTGGTGGTTCTAATGCTGATGCAAGCCTTTGATCTCCTTCTTCTTCTTGTGTTA 
TATTTTTGTGGATAAAATTTACAGAGAATTGTATCAATAATTATCATGTTAAAATTATAT 
GGGATGTGAGAGCTAATATTGCAATTGTAGACCAAGTTCTCTTAAAAAAAAAAAAAAAAA

AA 

>G557 Amino Acid Sequence (domain in AA coordinates: 90-150) 

MQEQATSSLAASSLPSSSERSSSSAPHLEIKEGIESDEEIRRVPEFGGEAVGKETSGRES

GSATGQERTQATVGESQRKRGRTPAEKENKRLKRLLRNRVSAQQARERKKAYLSELENRV

KDLENKNSELEERLSTLQNENQMLRHILKNTTGNKRGGGGGSNADASL*

>G577 (44.. 2155) 

AAAAACAGACTGAGAGAGAGAGAGAGAGAGTGTGTTGTTGGCCATGGGATGCACGGCCTC 
CAAGCTCGACAGTGAGGATGCTGTCCGTCGCTGCAAGGAGCGGCGCCGTCTTATGAAGGA 
CGCCGTCTACGCTCGTCACCATCTCGCCGCCGCTCACTCTGACTACTGCCGCTCCCTTCG 
TCTCACTGGCTCTGCCCTCTCCTCCTTCGCCGCCGGCGAGCCCCTCTCCGTCTCCGAGAA 
TACTCCCGCTGTTTTTCTCCGCCCTTCCTCCAGTCAGGACGCGCCACGTGTCCCTTCTTC 
CCATTCCCCAGAACCCCCTCCtCCGCCCATCCGCAGCAAGCCTAAGCCTACTAGGCCTAG
GAGGCTTCCAGACATTCTCTCCGACTCCTCT 

TCCCACTGCTCACCAGAACTCTACTTACTCTCGCTCTCCATCTCAAGCTTCCTCTGTCTG 
GAACTGGGAGAATTTCTACCCTCCCTCTCCCCCCGACTCCGAGTACTTCGAACGCAAAGC 
TCGCCAGAACCACAAGCACCGTCCTCCTTCCGACTACGACGCCGAAACTGAAAGATCCGA 
CCACGATTACTGCCACTCACGGAGAGATGCCGCCGAGGAAGTTCACTGCAGCGAGTGGGG 
CGACGACCACGACCGTTTCACTGCCACCTCTTCGTCCGACGGAGATGGGGAGGTCGAAAC 
TCACGTTTCCAGATCCGGTATTGAAGAAGAGCCTGTGAAACJ^ACCACATCZAAGACCCAAA 
TGGCAAAGAGCACTCTGACCATGTTACCACTTCTTCCGACTGCTACAAGACCAAATTGGT 
GGTAAGGCACAAGAATTTGAAGGAGATCCTTGACGCCGTTCAAGACTACTTCGACAAGGC 
TGCCTCCGCTGGGGACCAGGTCTCCGCCATGCTTGAGATCGGCCGGGCTGAGCTCGACCG 
CAGCTTCAGCAAGCTGAGGAAGACGGTGTATGATTCAAGCAGTGTGTTCAGCAACTTGAG 
CGCAAGCTGGACCTCAAAACCCCCATTGGCAGTCAAATACAAGCTCGATGCATCTACCCT 
GAATGATGAACAAGGCGGCCTCAAGAGCCTCTGCTCCACTCTAGACCGACTCCTCGCTTG 
GGAAAAGAAGCTTTATGAGGATGTCAAGGCAAGAGAAGGAGTTAAGATTGAGCACGAGAA 
GAAGCTGTCTGCGCTGCAGAGTCAGGAGTATAAGGGAGGTGATGAATCCAAGCTAGACAA 
GAGTAAAACTTCCATAACCAGACTGCAATCACTCATCA^ 

AACCACGTCTAATGCCATTCTCCGCCTCCGGGACACTGACCTTGTCCCTCAGCTTGTTGA 
ACTCTGCCACGGATTAATGTACATGTGGAAGTCAATGCACGAGTATCACGAAATCCAGAA 
CAACATCGTGCAACAAGTCCGTGGCCTGATCAACCAAACAGAGAGAGGTGAGTCAACATC 
AGAGGTACACCGGCAGGTGACGCGGGACCTAGAGTCAGCTGTGTCCTTGTGGCATTCGAG 
CTTCTGTCGCATCATTAAATTCCAGAGGGAGTTCATATGCTCTCTCCACGCATGGTTCAA 
GCTGAGCCTGGTTCCCCTGAGCAACGGAGACCCAAAGAAACAGCGGCCAGACTCATTTGC 
CTTGTGCGAGGAGTGGAAGCAGAGCCTGGAACGGGTGCCTGACACAGTGGCGTCAGAAGC 
CATAAAGAGCTTTGTAAACGTGGTACATGTGATATCAATAAAGCAGGCGGAAGAGGTGAA 
GATGAAGAAACGCACGGAGAGTGCAGGAAAGGAGCTGGAGAAGAAAGCATCCTCACTGAG 
GAGCATAGAGAGGAAGTACTACCAGGCATACTCGACGGTTGGGATAGGCCCTGGACCGGA 
GGTGTTGGACTCACGGGACCCGCTATCTGAGAAGAAATGTGAGCTGGCGGGATGTCAGAG 
GCAGGTGGAGGATGAGGTAATGAGGCACGTGAAGGCTGTGGAGGTGACACGAGCTATGAC 
TCTCAACAATCTACAAACCGGCCTGCCCAATGTATTCCAGGCCTTGACCAGCTTCTCATC 
TCTCTTCACTGAATCTCTCCAGACTGTCTGTTCTCGTTCCTACTCCATCAACTGATTATG 
TCCAAGTTTCTCATTTATTTTTAAGCTCTCATTACGTTGGTATCATGTAAATTTGAGGAT 
TGATTAAATTGAGTCTTGTGGTTTTGTGAGGACTCACAATCTTTCTCATTTAAAAAAAAA 

AAAAAAAAAA 

>G577 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGCTASKLDSEDAVRRCKERRRLMKDAVYARHHLAAAHSDYCRSLRLTGSALSSFAAGEP

LSVSENTPAVFLRPSSSQDAPRVPSSHSPEPPPPPIRSKPKPTRPRRLPHILSDSSPSSS 

PATSFYPTAHQNSTYSRSPSQASSVWli^NFYPPSPPDSEYFERKARQNHKHRPPSDyDA 

ETERSDHDYCHSRRDAAEEVHCSEWGDDHDRFTATSSSDGDGEVETHVSRSGIEEEPVKQ

PHQDPNGKEHSDHVTTSSDCYKTKLVVRHKNLKEILDAVQDYFDKAASAGDQVSAMLEIG

RAELDRSFSKLRKTVYHSSSVFSNLSASWTSKPPLAVKYKLDASTLND 

DRLLAWEKKLYEDVKAREGVKIEHEKKLSALQSQEYKGGDESKLDKTKTSITRLQSLIIV

SSEAVLTTSNAILRLRDTDLVPQLVELCHGLMYMWKSMHEYHEIQNNIVQQVRGLINQTE

RGESTSEVHRQVTRDLESAVSLWHSSFCRIIKFQREFICSLHAWFKLSLVPLSNGDPKKQ 

RPDSFALCEEWKQSLERVPDTVASEAIKSFV15r\A7HVISIKQAEEVKMKKRT^ 

KASSLRSIERKYYQAYSTVGIGPGPEVLDSRDPLSEKKCELAACQRQVEDEVMRHVKAVE 

VTRAMTLNNLQTGLPNVFQALTSFSSLFTESLQTVCSRSYSIN* 

>G674 (1..786)

ATGGTGTTTAAATCAGAAAAATCAAACCGGGAAATGAAATCAAAGGAGAAGCAAAGGAAG 
GGATTATGGTCACCCGAGGAAGATGAGAAGCTTAGGAGTCATGTCCTCAAATATGGCCAT 
GGATGCTGGAGTACTATTCCTCTTCAAGCTGGATTGCAGAGGAATGGGAAGAGTTGTAGA 
TTAAGGTGGGTTAATTATTTAAGACCTGGACTTAAGAAGTCTTTATTCACTAAACAAGAG 
GAAACTATACTTCTTTCACTTCATTCCATGTTGGGTAACAAATGGTCTCAGATATCGAAA 
TTCTTAC(^GGAAGAACCGACAACGAGATCAAAAACTATTGGCATTCTAATCTAAAGAAG 
GGTGTAACTTTGAAACAA(^TGAAACCACAAAAAAACATC^ 

TCACTTGAGGCCTTGCAGAGTTCAACTGAAAGATCTTCTTCATCTATCAATGTCGGAGAA 
ACGTCTAATGCTCAAACCTCAAGCTTTTCGCCAAATCTCGTGTTCTCGGAATGGTTAGAT 
CATAGTTTGCTTATGGATCAGTCACCTCAAAAGTCTAGCTATGTTCAAAATCTTGTTTTA

CCGGAAGAGAGAGGATTCATTGGACCATGTGGCCCTCGTTATTTGGGAAACGACTCTTTG 

CCTGATTTCGTGCCAAATTCAGAATTTTTGTTGGATGATGAGATATCATCTGAGATCGAG 

TTCTGTACTTCATTTTCAGACAACTTTTTGTTCGATC 

ATGTAA 

>G674 Amino Acid Sequence (domain in AA coordinates: 20-120) 
MVFKSEKSNREMKSKEKQRKGLWSPEEDEKLRSHVLKYGHGCWSTIPLQAGLQRNGKSCR
LRWVNYLRPGLKKSLFTKQEETILLSLH^ 

GVTLKQHETTKKHQTPLITNSLEALQSSTERSSSSINVGETSNAQTSSFSPNLVFSEWLD 
HSLLMDQSPQKSSYVQNLVLPEERGFIGPCGPRYLGNDSLPDFVPNSEFLLDDEISSEIE 
FCTSFSDNFLFDGLINELRPM* 
>G736 (1..513) 

ATGGCGACTCAAGATTCTCAAGGGATTAAACTCTTTGGCAAAACTATTGCATTTAACACT 

CGAACAATAAAAAATGAAGAAGAGACACACCCGCCGGAGCAAGAAGCCACAATAGCCGTT 

AGATCATCATCATCATCGGATCTGACGGCCGAGAAGCGTCCGGATAAGATCATAGCATGT 

CC^^GATGCAAGAGCATGGAGACTVAAGTTCTGTTACTTCAACAACTACA 

CCTCGACACTTTTGTAAAGGCTGCCACCGTTACTGGACCGCCGGTGGTGCACTCCGGAAC 

GTTCCCGTCGGCGCCGGTCGTCGGAAGTCCAAACCACCTGGTCGTGTCGTGGTTGGTATG 

CTTGGAGATGGAAATGGTGTTCGCCAAGTCGAGCTTATAAATGGCTTGCTCGTTGAGGAG 

TGGCAGCATGCCGCAGCCGCAGCTCACGGTAGTTTCCGGCATGATTTTCCCATGAAGCGG 

CTCCGGTGTTACTCCGACGGTCAATCGTGCTGA 

>G736 Amino Acid Sequence (domain in AA coordinates: 54-111) 
MATQDSQGIKLFGKTIAFNTRTIKNEEETHPPEQEATIAVRSSSSSDLTAEKRPDKIIAC 
PRCKSMETKFCYFNNYNGNQPRHFCKGCHRYWT^^ 

LGDGNGVRQVELINGLLVEEWQHAAAAAHGSFRHDFPMKRLRCYSDGQSC* 
>G903 (96..1496)

CCCGGGTCGACCCACGCGTCCGCTCTCTCTCTCTGAACTATACAAAAACCTACTTTTAAT 
TTCTCTTCCAAGAAGTCAAGAACCCAGAAGAAGACATGACAAGTGAAGTTCTTCAAACAA 
TCTCAAGTGGATCAGGTTTTGCT^GCCACAGAGCTO^TCAACCCTGGATCATGATGAAT 
CTCTCATCAATCCTCCTCTTGTTAAGAAAAAGAGAAATCTCCCTGGAAATCCTGATCCGG 
AAGCTGAAGTGATAGCTTTATCCCCCACGACCTTGATGGCTACGAACCGGTTCCTATGTG 
AGGTATGTGGC^UUVGGTTTCCAAAGAGACCAAAACTTACAGCTTCATCGGCGAGGACATA 
ATCTTCCATGGAAGTTGAAGCAGAGGACAAGCAAAGAAGTGAGAAAACGTGTCTACGTTT 
GCCCCGAGAAGACATGTGTCCACCATCACTCCTCTAGAGCTCTAGGCGATCTCACTGGAA 
TCAAAAAGCATTTTTGCCGGAAACACGGGGAGAAGAAGTGGACGTGCGAGAAATGTGCTA 
AGAGATACGGAGTCCAATCTGATTGGAAAGCTCATTCCAAGACTTGTGGTACTAGAGAGT 
ACCGTTGCGATTGTGGC^CCATTTTCTCAAGGCGAGACAGCTTTATCACTCATAGAGCTT 
TCTGCGATGCCTTAGCGGAAGAAACCGCTAAGATAAACGCAGTGTCTCATCTCAACGGTT 
TAGCCGCGGCTGGAGCCCCAGGATCAGTTAATCTCAACTATCAATATCTCATGGGAACAT 
TCATCCC^CCGCTTC^CCATTTGTACC&CAAC^ 

AAC^TTTTCAGCCACCAACTTCTTCGTCGCTCTCTCTATGGATGGGACAAGATATCGCGC 

CGCCTCAACCGCAACCGGACTACGATTGGGTTTTTGGAAACGCTAAGGCAGCGTCTGCTT 

GCATTGATAATAATAATACTCACGATGAGCAGATTACGCAAAACGC^^ 

CCACTACCACTACTCTCTCTGCCCCTTCTTTATTCAGCAGCGACCAACCACAAAACGCAA 

ACGCAAATTCAAACGTGAATATGTCCGCGACAGCTTTACTACAGAAAGCTGCTGAAATTG 

GCGCTACTTCTACAACAACCGCAGCGACCAATGACCCATCAACGTTTCTTCAAAGTTTCC 

CGCTTAAATCCACGGATCAAACC^CCAGTTATGACAGTGGCGAAAAGTTTTTTGCTTTGT 

TCGGGTCTAACAAGAACATTGGGTTAATGAGTCGTAGTCATGATCATCAAGAGATCGAGA 

ACGCTAGAAATGACGTTACGGTTGCGTCTGCCTTGGATGAATTACAGAATTACCCTTGGA 

AACGTAGAAGAGTTGATGGTGGAGGTGAAGTGGGTGGAGGAGGGCAAACTCGGGATTTCC 

TCGGGGTTGGTGTACAAACGTTGTGCCATCCATCGTCTATCAATGGATGGATTTGAAAGA 

GTTTAAAAATTTCGGGGTTAATGCATAAATTACGTAAAAGAAGAAGGAATCTTTTGTCAT 

TTCCACCATTTTCTAAGATAACATATGTATATGGTAATGGAAGTTGTTTTCTTTTATTAA 

TTCAATATTCTAAAACTTATGATATATGTATAATGAATGTGTTTATCTTCAAA 

>G903 Amino Acid Sequence (domain in AA coordinates: 68-92) 

MTSEVLQTISSGSGFAQPQSSSTLDHDESLINPPLVKKKRNLPGNPDPEAEVIALSPTTL 

MATORFLCEVCGKGFQRDQNLQLHRRGHNLPWKL 
RALGDLTGIKKHFCRKHGEKKWTCEKCAKRYAVQSDWKAHSKTCGTREYRCDCGTIFSRR

DSFITHR^CDALAEETAKINAVSHLNGLAAAGAPGSVNliNYQYIiM 

QTNPNHHHQHFQPPTSSSLSLWMGQDIAPPQPQPDYDWVFGNAKAASACIDNNNTHDEQI 

TQNANASLTTTTTLSAPSLFSSDQPQNANANSNVNMSATALLQKAAEIGATSTTTAATND 

PSTFLQSFPLKSTDQTTSYBSGEKFFALFGSNNNIGLMSRSHDHQEIENARNDVTVASAL 

DELQNYPWKRRRVDGGGEVGGGGQTRDFLGVGVQTLCHPSSINGWI* 

>G917 (32..679)

TTAGGGTTTTAGAAAGATAGATCGATTGAAGATGAGGAAAGGTAAGAGAGTGATAAAAAA 
GATAGAGGAGAAAATAAAGAGACAAGTGACATTCGCAAAGAGAAAGAAGAGTCTAATCAA 
GAAGGCATATGAACTCTCTGTTCTCTGCGATGTCCACCTTGGTCTCATCATCTTCTCTCA 
CTCCAACAGGCTCTACGATTTCTGCTCCAACTCTACCAGCATGGAGAATCTCATCATGAG 
ATACCAAAAGGAAAAAGAAGGTCAAACCACTGCAGAACACAGTTTCCACTCGGATCAGTG 
TTCAGATTGCGTGAAGACGAAGGAATCAATGATGAGAGAGATAGAGAATCTTAAGCTGAA 
TCTTC^TTGTACGACGGACATGGCTTGAATCTC 

TGAGCTCCATCTCGAATCTTCTCTACAACATGCTCGAGCTCG(^GTCTGAGTTCATGCA 
TCAGCAGCAGCAGCAACAAACAGATCAAAAGCTTA^^ 

CTCTTGGGAGCAGCTGATGTGGCAAGCAGAGAGACAGATGATGACGTGTCAAAGACAAAA 
AGATCCTGCGCCGGCGAATGAAGGAGGAGTTCCTTTTTTACGGTGGGGAACAACCCACCG 
ACGTTCTTCACCTCCTTAAGCTACCACAACCAGGCCCAAATACAGGCCCATAACTTCTCT 
CTATCTATAAAAAACAACTGATAGTAAAAAGTATTGACCCGGTTTGGTTCGGTTATGTTG 
ATACCAGACTATTAATTAACTTCGGTTAGACGTATTTACGACTTGATGCTATCTAGACCT 

XTTTGC CCTTCAAAAAAA 

>G917 Amino Acid Sequence (conserved domain in AA coordinates: 2-57)

mKGKRVIIOaEEKIKRQVTFAKRKKSLI 

STSMENLIMRYQKEKEGQTTAEHSFHSDQCSDCVOT 

LLTYDELLSFEiaLESSLQHARARKSEFMHQQQQQQTDQKLKGKEKGQGSSWEQL^QM 

RQMMTCQRQKDPAPANEGGVPFLRWGTTHRRSSPP* 

>G921 (116..1024)

CCAAGATCGACTCTTACTTCGAATCTCTCTCAACTTTCTTCCTCAGCTTACGGGAACTTC 
GACACATATAt^TCCACAAGAACCCATATCG 

TCAGTACT(^TCCTCTTTGGTCGATACTTCATTAGATCTCACTATTGGCGTTACTCGTAT 

GCGAGTTGAAGAAGATCCACCGACAAGTGCTTTGGTGGAAGAATTAAACCGAGTTAGTGC 

TGAGAACAAGAAGCTCTCGGAGATGCTAACTTTGATGTGTGACAACTACAACGTCTTGAG 

GAAGCAACTTATGGAATATGTTAACAAGAGCAACATAACCGAGAGGGATCAAATCAGCCC 

TCCCAAGAAACGCAAATCCCCGGCGAGAGAGGACGCATTCAGCTGCGCGGTTATTGGCGG 

AGTGTCGGAGAGTAGCTCAACGGATCAAGATGAGTATTTGTGTAAGAAGCAGAGAGAAGA 

GACTGTCGTGAAGGAGAAAGTCTCAAGGGTCTATTACAAGACCGAAGCTTCTGACACTAC 

CCTCGTTGTGAAAGATGGGTATCAATGGAGGAAATATGGACAGAAAGTGACTAGAGACAA 

TCCATCTCCAAGAGCTTACTTCAAATGTGCTTGTGCTCCAAGCTGTTCTGTCAAAAAGAA 

GGTTCAGAGAAGTGTGGAGGATCAGTCCGTGTTAGTTGCAACTTATGAGGGTGAACACAA 

CCATCCAATGCCATCGCAGATCGATTCAAACAATGGCTTAAACCGCCACATCTCTCATGG 

TGGTTCAGCTTCAACACCCGTTGCAGCAAACAGAAGAAGTAGCTTGACTGTGCCGGTGAC 

TACCGTAGATATGATTGAATCGAAGAAAGTGACGAGCCCAACGTCAAGAATCGATTTTCC 

CCAAGTTCAGAAACTTTTGGTGGAGCAAATGGCTTCTTCCTTAACCAAAGATCCTAACTT 

TAC^GCAGCTTTAGCAGC^GCTGTTACCGGAAAATTGTATCAACAGAATCATACCGAGAA 

ATAGTTTAGCTTCAAATTCCGTTAGAGTTTTTAGATTTGAATTTGTCATGAGTAAGAGAA 

AGAGAGTAGATTATAATCCNTTGTGATACTGAAAAAAAAAAAAAAAAAAA 

>G921 Amino Acid Sequence (domain in AA coordinates: 146-203) 

MDQYSSSLVDTSLDLTIGVTRMRV^ 

LRKQLMEYWKSNITERDQISPPKKRKSPAREDAFSCAVIGGVSESSSTDQDEYLCKKQR 

EETVVKEKVSRVYYKTEASDTTLVVXDGYQWR]^ 

lOCVQRSV^DQSVLVATYEGEHiraPMPSQIDS*^ 

VTTVDMIESKKVTSPTSRIDFPQVQKLLVEQMASSLTKDPNFTAALAAAV 
EK* 

>G922 (1..1449) 

ATGGTGGCTATGTTTCAAGAAGATAATGGAACATCTTCTGTAGCTTCATCACCACTTCAA 
GTCTTCTCAACTATGTCACTCAACAGACCGACTCTCCTCGCTTCTTCATCTCCGTTTCAT 
TGTCTCAAAGATCTCAAACCAGAGGAGCGTGGTCTCTACTTAATCCACCTCTTGCTAACT

TGTGCCAACCACGTGGCTTCAGGTAGCCTCCAAAACGCTAACGCAGCGCTCGAGCAGCTC 

TCTCACCTCGCTTCTCCTGACGGCGACACGATGCAGCGAATCGCTGCTTACTTCACCGAA 

GCGCTTGCTAACAGAATCCTTAAGTCCTGGCCTGGTCTTTACAAGGCTCTTAACGCAACT 

CAGACAAGAACTAACAATGTCTCTGAGGAGATTCATGTTAGAAGACTCTTCTTTGAGATG 

TTCCCGATACTCAAAGTCTCTTACTTGCTCACTAATCGAGCTATACTCGAGGCTATGG 

GGAGAGAAGATGGTTCATGTGATTGATCTCGATGCTTCTGAGCCAGCTCAATGGCTTGCT 

TTGCTTCAAGCTTTTAACTCTAGGCCTGAAGGTCCACCTCATTTGAGAATCACTGGTGTT 

CATCACCAGAAGGAAGTGCTTGAACAAATGGCTCATAGACTCATTGAGGAAGCAGAGAAA 

CTCGATATCCCGTTTCAGTTTAATCCCGTTGTGAGTAGGTTAGACTGTTTAAATGTAGAA 

CAGTTGCGGGTTAAAACAGGAGAGGCCTTAGCCGTTAGCTCGGTTCTTCAATTGCATACC 

TTCTTGGCCTCTGATGATGATCTCATGAGAAAGAACTGCGCTTTACGGTTTCAGAACAAC 

CCTAGTGGAGTTGACTTGCAGAGAGTTCTAATGATGAGCCATGGCTCTGCAGCTGAGGCA 

CGTGAGAATGATATGAGTAACAACAATGGGTATAGCCCTAGCGGTGACTCGGCCTCATCT 

TTGCCTTTACCAAGTTCAGGAAGGACTGATAGCTTCCTCAATGCTATTTGGGGTTTGTCT 

CCAAAGGTCATGGTGGTCACTGAGCAAGACTCAGACCACAACGGCTCCACACTAATGGAG 

AGGCTATTAGAATCACTTTACACCTACGCAGCATTGTTTGATTGCTTGGAAACAAAAGTT 

CCAAGAACGTCTCAAGATAGGATCAAAGTGGAGAAGATGCTCTTCGGGGAGGAGATCAAG 

AACATC^TATCCTGCGAGGGATTTGAGAGAAGAGA7^GACACGAGAAGCTTGAGAAATGG 

AGCCAGAGGATCGATTTGGCTGGTTTTGGGAATGTTCCTCTTAGCTATTATGCGATGTTG 

CAGGCTAGGAGATTGCTTCAAGGGTGCGGTTTTGATGGGTATAGAATCAAGGAAGAGAGC 

GGGTGCGCAGTAATTTGCTGGCAAGATCGACCTCTATACTCGGTATCAGCTTGGAGATGC 

AGGAAGTGA 

>G922 Amino Acid Sequence (conserved domain in AA coordinates : 225- 

MVAMFQEDNGTSSVASSPLQVFSTMSLNRPTLLASSSPFHCLKDLKPEERGLYLIHLLLT 

CANHVASGSLQNANAALEQLSHLASPDGDTMQRIAAYFTEALANRILKSWPGLYKALNAT

QTRTNlTVSEEIHVI^LFFEMFPILKVSYLLTi^^ 

LLQAFNSRPEGPPHLRITGVHHQKEVLEQ 

QLRVTCTGEALAVSSVXQLHTFIiASDDDLMRKNCALRFQN^ 

RENDMSNl^GYSPSGDSASSLPLPSSGRTDSFLNAIWGLSPKVMvVTEQDSDHNGSTLME 
RLLESLYTYAALFDCLETK\^RTSQDRIKVEKMLFGEEIKNIISCEGFERRERHEKLEKW 
SQRIDLAGFGNVPLSYYAMLQARRLLQGCGFDGYRIKEESGCAVICWQDRPLYSVSAWRC
RK* 

>G932 (206..1213)

CCACGCGTCCGACCACTTGTACCTCTTTGTCTTAAGTACTCTTTAACCCTACAATTTCCT 
AAGCTCTCAAGCCACAAAAAACCACAAACCGTTCTTCACCAATATATATATCTGATCATC 
ATCAAAGTCCTTCTCTCTGCTCATACCACAAACCGTTCCATTCTTCCCCTAATCACAAAG 
TGATATTTACATAGAGAAGATAGAGATGGGAAGACCACCATGCTGTGACAAGATTGGAGT 
GAAGAAAGGACCATGGACACCAGAGGAAGATATCATCTTGGTTTCTTACATCCAAGAACA 
TGGTCCTGGAAACTGGAGATCTGTGCCTACTCACACAGGTTTGAGGAGATGTAGCAAAAG 
CTGTAGATTGAGGTGGACTAATTATCTTCGACCTGGGATCAAGCGTGGAAATTTCACCGA 
GCATGAAGAGAAGATGATTCTCCATCTTCAAGCTCTTTTGGGAAACAGGTGGGCAGCTAT 
AGCATCATATCTTCCAGAAAGGACAGACAATGATATAAAGAACTATTGGAACACTCATTT 
GAAGAAAAAGCTCAAGAAGATGAATGATTCTTGTGATAGTACTATCAACAATGGCCTTGA 
TAATAAAGACTTCTCCATATCAAACAAAAACACTACCTCACATCAAAGCAGCAACTCCAG 
TAAAGGTCAATGGGAGAGAAGGCTTCAGACAGATATCAACATGGCTAAACAAGCTCTTTG 
TGATGCCTTGTCTATTGAOU^CCACAAAACCCAACTAATTTTTCTATTCCCGATCTTGG 
TTATGGTCCATCATeTTCTTCGTCCTCTACCACCACCACCACCACCACCACCACCACGAG 
AAAC^CTAATCCATACCCATCTGGGGTCTATGCTTCAAGTGCTGAGAACATTGCTCGTTT 
GCTTCAGAATTTTATGAAAGACACACCAAAGACCTCGGTGCCCTTGCCGGTTGCAGCCAC 
CGAGATGGCTATCACCACGGCAGCTTCGAGCCCTAGCACAACCGAAGGAGACGGAGAAGG 
GATTGACCATTCTTTGTTC^GCTTCAACTCCATAGATGAAGCTGAAGAGAAGCCTAAACT 
AATAGAC(^TGACATTAATGGTCTAATTACACAAGGCTCTCTTTCTTTGTTCGAGAAATG 
GCTCTTTGATGAGCAAAGCCACGATATGATCATCAATAACATGTCACTAGAGGGTCAGGA 
AGTGTTGTTCTAGAAAGCATTAAAGTTTGACGATTTGCTTGAGGAACCACGAGGCTTAGT 
TATAAACAATTTGTATAATTAAGTACTCTTTAGTTTTGTTTTCAATCCTTATTATGATCA 
TATTGCAGTAATTAGGGATTTTAGTCTTTAGTAGTAACTCTTAAGTTTTAACACATTTTT 
CTCTATCTTTTTAGTAGTAACTCTTTATTTTTTCCTTAAATCTTTC
ATATCTTCTATGTAGTAGAAACTCAAAAGTGTACATCATCTTTATTAATGTAACGTCTTT 

TTAAAAAAAAAAAAAAAAA 

>G932 Amino Acid Sequence (domain in AA coordinates: 12-118) 
MGRPPCCDKIGVKKGPWTPEEDIILVSYIQEHGPGNWRSVPTHTGLRRCSKSCRLRWTl^ 
LRPGIKRGNFTEHEEKMILHLQALLGNRWAAIASYL^ 

DSCDSTINNGLDNKDFSISNKNTTSHQSSNSSKGQWERRLQTDINMAKQALCDALSIDKP
QNPTNFSIPDLGYGPSSSSSSTTTTTTTTTTRNTNPYPSGWM 

PKTSVPLPVAATEMAITTAASSPSTTEGDGEGIDHSLFSFNSIDEAEEKPKLIDHDINGL 
ITQGSLSLFEKWLFDEQSHDMIINNMSLEGQEVLF*
>G599 (152..1579)

TCGACAGAACAGCTTCGTTGTCACTTGTCATTCTATAAATCGCATCCCCATTGACAACCT 
TTCACTTCCATCAAAACTCTCTCTCTATATCTCTCTCTCTATATATCTCTCTCTATATCT 
CTCTCTCTCTTGACTCTCTCTTTCTTTC^^ 

ACCCGACCCGGTTTACCGTCCT^CCGGAAACACCACTCGAACCGATGGAGTTTTTAGCTCG 

TTCATGGAGCGTCTCTGCTCTCGAAGTCTCCAAGGCTCTAACACC^CCC^CCCTCAGAT 

TCTCCTCTCCAAAACCGAAGAAGAAGAAGAAGAAGAACCCATCTCCTCTGTCGTAGACGG 

CGACGGCGAC^CGGAAGACACCGGACTTGTCACCGGAAACCCATTCTCCTTCGCTTGTTC 

AGAAACTTCTCAAATGGTCATGGATCGTATCTTGTCTCACTCTC^GAAGTATCACC^AG 

AACATCTGGTCGGCTATCTCACAGTAGTGGTCCACTTAATGGTTCTTTGACCGACAGTCC 

TCCTGTGTCTCCTCCCGAATCCGACGACATTAAGCAATTTTGCAGAGCGAACAT^AAATTC 

ATTGAACAGTGTAAATTCTCAGTTCCGTTCAACGGCGGCAACTCCGGGACCTATAACCGC 

TACAGCTACACAGTCC^GACGGTGGGACGGTGGCTTAAGGACCGGAGAGAGAAAAAGAA 

AGAGGAGACTCGGGCTCATAACGCTCAGATTCACGCTGCTGTCTCTGTCGCCGGCGTTGC 

TGCAG(^TTGCTGCTATTGCAGCAGC(^CCGCrGCGTCTTCTAGCTGTGGTAAGGATGA 

GCAGATGGCTAAT^CTGACATGGCCGTTGCTTCTGCTGCGACCCTTGTGGCTGCTCAGTG 

TGTGGAAGCTGCTGAAGTTATGGGAGCTGAGAGAGAGTATTTGGCTTCTGTTGTTAGCTC 

CGCCGTCAATGTTCGTTCTGCCGGAGATATTATGACTCTCACCGCCGGAGCAGCTACAGC 

TTTAAGAGGAGTGCAAACATTGAAGGCAAGGGCAATGAAGGAAGTGTGGAACATAGCATC 

AGTGATACOVATGGATAAAGGACTCACTTCTACAGGAGGAAGCAGCAATAATGTTAATGG 

TAGCAATGGAAGCTC^GCAGTAGT<^CAGTGGTGAACTTGTACAACAGGAGAATTTCTT 

GGGAACTTGTAGTAGAGAATGGCTCGCTAGAGGTTGTGAACTCCTCAAACGCACTCGCAA 

AGGTGATCTCCACTGGAAGATAGTATCTGTTTACATCAACAAAATGAATCAGGTTATGTT 

GAAGATGAAGAGCAGGCATGTTGGAGGAACCTTCACCAAGAAGAAAAAGAACATTGTGCT 

TGATGTGATCAAGAATGTCCCGGCCn'GGCCTGGACGACATTTGCTAGAGGGAGGAGATGA 

TCTAAGATACTTCGGTTTGAAGACGGTTATGCGAGGTGATGTTGAATTCGAGGTCAAGAG 

CCAAAGGGAATATGAAATGTGGACACAAGGTGTCTCAAGGCTTCTTGTTCTTGCTGCTGA 

GAGGAAGTTTAGGATGTGAATAAACGTTCAATGGCTGCTTGGTTTAAGTGTGAGTTTTTT 

TTTAACTTATGTGGTCAAATTTCATTAGTAGGGGTTCTTTTAAGGTAATGGTTTTTTGGG 

TTGGGTATAGGATAAAATGGACCTACCAGTCAAGGTGAGGAAGCATTTGGGTAAACAAAA 

CTTAGTGGGGGTGATCnXSTAATATCTATGTTCTTAGTTTTTTTTTGGTTGTTGGTGGTCT 

TTTTGTATAAAAAAACAAAGTTGAAGTAATAGATATATAGTATGTTTTAATTTTAAA 

>G599 Amino Acid Sequence (domain in AA coordinates: 187-219, 264-300) 

MEKLMVPTWRPDPVYRPPETPLEPMEFLARSWSVSALEVSKALTPPNPQILLSKTEEEEE 

EEPISSWDGDGDTEDTGLVTGNPFSFACSETSQMVMDRILSHSQEVSPRTSGRLSHSSG 

PLNGSLTDSPPVSPPESDDIKQFCRANKNSLNSVNSQFRSTAATPGPITATATQSKTVGR 

WLKDRREKKKEETRAHNAQIHAAVSVAGVAAAVAAIAAATAASSSCGKDEQMAKTDMAVA

SAATLVAAQCVEAAEVMGAEREYLASVVSSAVNVRSAGDIMTLTAGAATALRGVQTLKAR

AMKEVWNIASVIPMDKGLTSTGGSSMNVNGSNGSSSSSHSGELVQQENFLGTCSREWIAR 

GCELLKRTRKGDLHWKIVSVYINKMNQVMLKMKSR^ 

GRHLLEGGDDLRYFGLKTVMRGDVEFEVKSQREYEMWTQGVSRLLVLAAERKFRM*
>G804 (114..1139)

ATACTCCAAGAATTTATAGGTTATAAGTAAT^AATTCAGTACAAGTTTGTTTGTTTGTTTA 
TTCCATTTTCTTGTGTGTTTTTTTCCCCATAATTTATAAATTTTATAAGCAATATGGAGT 
CCCACAAC^CAACCTVGAGCAACAACAACACCACTGGTTCGGCCC^TCTGGTCCCATCC^ 
TGGGACCAATCTCCGGTTCAGTCTCATTAACCACCACTGCTCCAAACTCCACTACCACCA 
CCGTCACCGCCGCTAAAACACCCGCAAAACGACCGTCCAAGGACCGTCACATCAAAGTAG 
ACGGACGTGGCCGGAGGATACGTATGCCGGCTATCTGCGCAGCACGTGTCTTCCAACTAA
CACGTGAGTTACAACACAAATCGGACGGCGAGACTATAGAGTGGCTGCTCCAACAAGCGG 
AGCCAGCTATCATCGCAGCCACCGGAACTGGAACCATACCGGCGAATATCTCTACTTTGA 
ACATCTCTCTTCGAAGCAGTGGCTCTACTCTTTCAGCTCC^CTC 

TGGGAAGAGCGGCTCAAAACGCTGCCGTTTTTGGGTTCCAGCAACAGCTTTATCATCCTC 
ATCATATCACGACAGATTCTTCTTCTTCTTCTCTTCCCAAAACATTCCGTGAAGAAGATC 
TTTTTAAAGATCCTAATTTTCTAGATCAAGAACCCGGTTCAAGATCACCTAAACCGGGAT 
CCGAAGCTCCTGATCAAGATCCGGGTTCGACCCGGTCAAGAACACAAAATATGATACCGC 
CGATGTGGGCACTAGCGCCAACGCCAGCCTCCACAAACGGAGGTAGTGCTTTTTGGATGT 
TACC7VGTCGGAGGAGGAGGAGGTCCGGCTAACGTTCAGGATCCATCACAGCACATGTGGG 
CGTTTAATCCGGGTCATTACCCGGGTCGAATCGGGTCGGTTCAGCTAGGGTCTATGTTAG 
TGGGAGGTCAACAGTTAGGGTTAGGTGTTGCAGAAAATAACAATTTGGGGCTATTTTCCG 
GCGGAGGAGGAGACGGTGGTCGGGTTGGTCTCGGAATGAGTCTTGAGCAAAAGCCTCAAC 
ATCAAGTGAGTGATCATGCTACTAGAGACCAAAATCCTACTATAGATGGTTCTCCTTGAA 
AGACTTCATGATTTCTTTGGTTTTTAAAAAGTGTGAATGTGTGATTTATTCCAACTTTTG 
TTGAGGACTCCAATGTTAATATGGGTTTTAGGGTTGGCTTTTCGGGATT 

ATT 

>G804 Amino Acid Sequence (domain in AA coordinates: 54-117) 

MESHNNNQSNmTTGSAHLVPSMGPISGSVSLTTTAPNSTTTTVTAAK 

KVDGRGRRIRMPAICAARVFQLTRELQHKSDGETIEWLLQQAEPAIIAATGTGTIPANIS

TLNISLRSSGSTLSAPLSKSFHMGRAAQNAAVFGFQQQLYHPHHITTDSSSSSLPKTFRE

EDLFKDPNFLDQEPGSRSPKPGSEAPDQDPGSTRSRTQNMIPPMWALAPTPASTNGGSAF 

WMLPVGGGGGPANVQDP'SQHMWAFNPGHYPGRI^^ 

FSGGGGDGGRVGLGMSLEQKPQHQVSDHATRDQNPTIDGSP* 

>G1062 (297..1781)

CAAAAAAAAAGTTTCAATTTTTGAAAGCTCTGAGAAATGAAATCTATC^TTCTCTCTCTC 
TATCTCTATCTTCCTTTTGAGATTTCGCTTCTTCAATTCATGAAATCCTCGTGATTCTAC 
TTTAATGCTTCTCTTTTTTTACTTTTCCAAGTCTCTGAATATTCAAAGTATATATCTTTT 
GTTTTCAAACTTTTGCAGAATTGTCTTCAAGCTTCCAAATTTCAGTTAAAGGTCTCAACT 
TTGCAGAATTTTCCTCTAAAGGTTCAGACTTTGGGGTAAAGGTGTCAACTTTGGCGATGG 
GTCTTGACGGAAACAATGGTGGAGGGGTTTGGTTAAACGGTGGTGGTGGAGAAAGGGAAG 
AGAACGAGGAAGGTTCATGGGGAAGGAATCAAGAAGATGGTTCTTCTCAGTTTAAGCCTA 
TGCTTGAAGGTGATTGGTTTAGTAGTAACCAACCACATCCACAAGATCTTCAGATGTTAC 
AGAATC^GCCAGATTTCAGATACTTTGGTGGTTTTCCTTTTAACCCTAATGATAATCTTC 
TTCTTCAACACTCTATTGATTCTTCTTCTTCTTGTTCTCCTTCTCAAGCTTTTAGTCTTG 
ACCCTTCTCAGCAAAATCAGTTCTTGTCAACTAACAACAACAAGGGTTGTCTTCTCAATC 
TTCCTTCTTCTGCAAACCCTTTTGATAATGCTTTTGAGTTTGGCTCTGAATCTGGTTTTC 
TTAACCAAATCCATGCTCCTATTTCGATGGGGTTTGGTTCT^ 

GGGATTTGAGTTCTGTTCCTGATTTCTTGTCTGCTCGGTCACTTCTTGCGCCGGAAAGCA 
ACAACAACAACACAATGTTGTGTGGTGGTTTCACAGCTCC 

GTAGTCCTGCTAATGGTGGTTTTGTTGGGAACAGAGCGAAAGTTCTGAAGCCTTTAGAGG 
TGTTAGCATCGTCTGGTGCACAGCCTACTCTGTTCCAGAAACGTGCAGCTATGCGTCAGA 
GCTCTGGAAGCAAAATGGGAAATTCGGAGAGTTCGGGAATGAGGAGGTTTAGTGATGATG 
GAGATATGGATGAGACTGGGATTGAGGTTTCTGGGTTGAACTATGAGTCTGATGAGATAA 
ATGAGAGCGGTAAAGCGGCTGAGAGTGTTCAGATTGGAGGAGGAGGAAAGGGTAAGAAGA 
AAGGTATGCCTGCTAAGAATCTGATGGCTGAGAGGAGAAGGAGGAAGAAGCTTAATGATA 
GGCTTTATATGCTTAGATCAGTTGTCCCCAAGATCAGC7yVAATGGATAGAGCATCAATAC 
TTGGAGATGCAATTGATTATCTGAAGGAACTTCTACAAAGGATCAATGATCTTCACAATG 
AACTTGAGTCAACTCCTCCTGGATCTTTGCCTCCAACTTCATCAAGCTTCCATCCGTTGA 
CACCTACACCGCAAAGTCTTTCTTGTCGTGTCAAGGAAGAGTTGTGTCCCTCTTCTTTAC 
CAAGTCCTAAAGGCCAGCAAGCTAGAGTTGAGGTTAGATTAAGGGAAGGAAGAGCAGTGA 
ACATTCATATGTTCTGTGGTCGTAGACCGGGTCTGTTGCTCGCTACCATGAAAGCTTTGG 
ATAATCTTGGATTGGATGTTCAGCAAGCTGTGATCAGCTGTTTTAATGGGTTTGCCTTGG 
ATGTTTTCCGCGCTGAGC^TGCCAAGAAGGAC^GAGATACTGCCTGATCAAATa^AG 
CAGTGCTTTTCGATACAGCAGGGTATGCTGGTATGATCTGATCTGATCCTGACTTCGAGT 
CCATTAAGCATCTGTTGAAGCAGAGCTAGAAGAACTAAGTCCCTTTAAATCTGCAATTTT 
CTTCTCAACTTTTTTTCTTATGTCATAACTTCAATCTAAGCATGTAATGCAATTGC^AAT 
GAGAGTTGTTTTTAAATTAAGCTTTTGAGAACTTGAGGTTGTTGTTGTT

TTCAACCTTTTATTAGCAATGTTAACTTCCATTTATGTTTCATCTT 

>G1062 Amino Acid Sequence (domain in AA coordinates: 308-359) 

MGLDGNNGGGVWLNGGGGEREENEEGSWGRNQEDGSSQFKPMLEGDWFSSNQPHPQDLQM 

LQNQPDFRYFGGFPFNPNDNLLLQHSIDSSSSCSPSQAFSLDPSQQNQFLSTNWNKGCLL 

NVPSSANPFDNAFEFGSESGFLNQIHAPISMGFGSLTQLGNRDLSSVPDFLSARSLLAPE 

SNftTCtfNTMIiCGGFTAPLELEGFGSPANGGFVGNRAICVL 

QSSGSKMGNSESSGMRRFSDDGDMDETGIEVSGLNYESDEINESGKAAESVQIGGGGKGK 
KKGMPAKNLMAERRRRKKLNDRLYMLRSVVPKISKMDRASILGDAIDYLKELLQRINDLH
NELESTPPGSLPPTSSSFHPLTPTPQTLSCRVKEELCPSSLPSPKGQQARVEVRLREGRA
VNIHMFCGRRPGLLLATMKALDNLGLDVQQAVTSCFNGFALDVFRAEQCQEGQEILPDQI
KAVLFDTAGYAGMI*
>G1322 (213..833)

T^AAGTTATTGATAGTTTCTGTTACTTATTAATTTTTAAGGTTATGTGTATTATTACCAAT 

TGGAGGACTATATAGTCGCAAGTCTCAACCCTATAAAAGAAAACATTCGTCGATCATCTT 

CCCGCCTCGAGTATCTCTCTCTCTCTCTCTCTTCTCTGTTTTCTTTA1TGATTGCATAGA 

CAAAAATACACACATACACAACAGAAAGAAAGATGGAGACGACGATGAAGAAGAAAGGGA 

GAGTGAAAGCGACAATAACGTCACAGAAAGAAGAAGAAGGAACAGTGAGAAAAGGACCTT 

GGACTATGGAAGAAGATTTCATCCTCTTTAATTACATCCTTAATCATGGTGAAGGTCTTT 

GGAACTCTGTCGCCAAAGCCTCTGGTCTAAAACGTACTGGAAAAAGTTGTCGGCTCCGGT 

GGCTGAACTATCTCCGACCAGATGTGCGGCGAGGGAACATAACCGAAGAAGAACAGCTTT 

TGATCATTCAGCTTCATGCTAAGCTTGGAAACAGGTGGTCGAAGATTGCGAAGCATCTTC 

CGGGAAGAACGGACAACGAGATAAAGAACTTCTGGAGGACAAAGATTCAGAGACACATGA 

AAGTGTCATCGGAAAATATGATGAATCATCAACATCATTGTTCGGGAAACTCACAGAGCT 

CGGGGATGACGACGCAAGGC^GCTCCGGCAAAGCC^TAGACACGGCTGAGAGCTTCTCTC 

AGGCGAAGACGACGACGTTTAATGTGGTGGAACAACAGTCAAACGAGAATTACTGGAACG 

TTGAAGATCTGTGGCCCGTCCACTTGCTTAATGGTGACCACCATGTGATTTAAGATATAT 

ATATAGACCTCCTATACATTTATATGCCCCAGCTGGGTTTTTTTGTATGGTACGTTATTT 

GGTTTTTCTATTGCTGAAATGTCGTTGCATTTAATTTACATACGAAAAGTGCATTAAATC

ATTAAATCTTCAATACATATGGAGGTGGTGTTTGAGTAAAAAAAAAAAAAA 

>G1322 Amino Acid Sequence (domain in AA coordinates: 26-130)

METTMKKKGRVKATITSQKEEEGTVRKGPW 

RTGKSCRLRWLNYLRPDVRRGNITEEEQLLIIQLHAKLGNRWSKIAKHLPGRTDNEIKNF
WRTKJCQRHMKVSSENMMNHQHHCSGNSQSSGMTTQGSSGKAIDTAESFSQACT 
QQSNENYWNVEDLWPVHLLNGDHHVI*
>G1331 (1..786) 

ATGGTGGAAGAAGTTTGGAGAAAGGGTCCATGGACCGCCGAAGAAGACCGTCTTTTGATC 
GAATACGTCCGTGTTCACGGTGAAGGTCGTTGGAACTCTGTCTCTAAACTCGCAGGATTG 
AAAAGGAATGGCAAAAGCTGCAGACTAAGATGGGTGAATTACCTTAGACCAGACCTCAAG 
AGAGGACAGATCACTCCACATGAAGAAAGTATAATACTTGAGCTACACGCTAAGTGGGGA 
AATAGGTGGTCAACAATTGCACGTAGTTTACCAGGAAGAACAGACAATGAGATCAAGAAC 
TATTGGAGAACCCATTTCAAGAAAAAGGCAAAGCCTACGACTAACAATGCGGAGAAGATA 
AAGAGTCGTCTCCTAAAAAGGCAACACTTCAAGGAACAGAGAGAAATAGAGTTGCAACAA 
GAAGAGCAGTTGTTTCAGTTCGACCAACTCGGTATGAAAAAGATCATCTCTTTACTCGAA 
GAAAACAATAGCAGTAGCAGTAGCGATGGCGGTGGTGATGTGTTCTATTATCCTGATCAA 
ATAACACATTCATCAAAACCCTTTGGCTATAACTCTAATTCATTAGAGGAGCAGTTACAA 
GGTAGATTTTCTCCTGTAAACATACCTGATGCTAATACTATGAACGAAGACAATGCCATA 
TGGGACGGGTTTTGGAACATGGATGTTGTAAATGGACATGGTGGGAACTTGGGTGTTGTG 
GCTGCTACTGCTGCTTGTGGCCCAAGGAAGCCCTATTTCCATAACTTGGTGATTCCATTT 
TGTTAA 

>G1331 Amino Acid Sequence (conserved domain in AA coordinates: 8-109)

MVTSEVWKGPWTAEEDRLLIEYVRVHGEGRWN^ 

RGQITPHEESIILELHAKWGNRWSTIARSLPGRTDNEIKN^^ 

KSRLLKRQHFKEQREIELQQEQQLFQFDQLGMKKIISLLEENNSSSSSDGGGDVFYYPDQ
ITHSSKPFGYNSNSLEEQLQGRFSPWIPDAimiNEDNAI^ 
AATAACGPRKPYFHNLVIPFC*
>G1521 (1..891) 
ATGCCTCCATTACCGTCCTCCACGGCGCCTTCXjTCITCGAGACATCTTCGATCGCCGGAA

AGTATCGCGAAATTTGCAGGGAGAGCAATATTTCCTGCTTTACAGGGGAAATCGTGTCCG 

ATATGCCTCGAAAATCTAACCGAGCGAAGATCCGCCGCCGTGATCACGGTGTGCAAGCAC 

GGATACTGCCTTGCTTGTATTCGGAAGTGGAGCAGCTTCAAGAGGAATTGTCCTCTTTGT 

AACACTCGTTTTGATTCCTGGTTTATCGTTAGTGATTTTGCrrTCTAGAA 

GAGCAATTACCAATTCTTCGTGATCGTGAGACTTTAACTTATCATCGGAATAATCCTTCC 

GATCGCCGGAGGATAATTCAAAGGTCGAGGGATGTTTTGGAAAACTCTAGCTCAAGATCA 

AGGCCATTGCCATGGCGGAGATGATTTGGACGACCAGGTTCAGTTCCTGATTCTGTTATC 

TTCCAGCGAAAGCTTCAGTGGCGAGCTAGCATATACACTAAGCAATTACGAGCTGTTCGA 

TTACATTCAAGGCGCTTGGAACTAAGTTTGGCGGTGAATGATTACACCAAAGCAAAGATA 

ACTGAAAGAATTGAGCCATGGATTAGAAGAGAGCTTCAGGCAGTCCTTGGAGATCCTGAT 

CCCTCAGTTATTGTTCkTTTTGCGT 

AATCGACAAACCGGGCAGACCGGGATGTTGGTGGAAGATGAAGTCTCCTCTCTTCGAAAA 

TTCTTGTCTGATAAGGTGGATATATTTTGGCATGAACTAAGATGTTTTGCGGAGAGTATA 

CTCACGATGGAGACTTATGATGCAGTGGTTGAATACAATGAGGTGGAGTAA 

>G1521 Amino Acid Sequence (domain in AA coordinates: 39-80) 

MPPLPSSTAPSSSRHLRSPESIAKFAGRAIFPALQGKSCPICLENLTERRSAAVITVCKH

GYCLACIRKWSSFKRNCPLCNTRFDSWFIVSDFASR^ 

DRRRIIQRSRDVLENSSSRSRPLPWRRSFGRPGSVPDSVIFQRKLQWRASIYTKQLRAVR 
LHSRRLELSLAVNDYTKAKITERIEPWIRRELQAVLGDPDPSVIVHFASALFIKRLEREN 
NRQTGQTGMLVEDEVSSLRKFLSDKVDIFWHELRCFAESILTMETYDAVVEYNE*
>G183 (1..1458) 

ATGAGTGATTTTGATGAAAACTTCATCGAAATGACGTCGTATTGGGCTCCACCATCCAGT 
CCTAGCCCAAGAACGATATTGGCAATGCTGGAGCAT^ACCGACAATGGTCTGAATCCAATC 
AGTGAGATCTTCCCTCAAGAAAGCTTGCCAAGAGATCATACTGATCAATCTGGAC!AAAGA 
TCTGGTCTTCGTGAGAGACTGGCTGCAAGAGTAGGATTCAATCTTCCAACACTCAATACA 
GAAGAAAACATGAGTCCTTTGGATGCATTTTTCAGGAGCTCGAATGTTCCTAATTCTCCT 
GTCGTTGCAATCrrCrCCAGGATTCAGTCCATCAGCACTAT^ 

AGTGATTCTTCCCAGATTATCCCTCCGTCTTCAGCCACCAATTACGGACCTCTAGAGATG 
GTGGAAACTTCCGGTGAAGACAATGCAGCGATGATGATGTTCAACAACGATCTTCCTTAT 
CAGCCGTACAATGTTGATCTGCCTTCTCTAGAAGTCTTTGATGATATTGCAACGGAAGAG 
TCCTTTTATATCCCATCTTATGAACCTCATGTTGACCCAATTGGAACTCCTTTAGTCACA 
TCCTTTGAATCTGAACTCGTTGACGATGCCCATACCGACATCATCTCCATTGAGGACAGT 
GAGAGCGAGGATGGAAACAAAGATGATGACGACGAGGACTTCCAATACGAAGACGAAGAC 
GAAGACCAATACGACCAAGATCAAGATGTAGATGAAGATGAAGAGGAAGAAAAAGATGAA 
GACAATGTTGCATTAGATGATCCTCAACCTCCACCTCCAAAGAGAAGGAGATATGAGGTA 
TCAAACATGATTGGAGCC^CAAGAACAAGC^ 

AGCGACGAAGACAATCCTAACGATGGTTATCGCTGGAGAAAATACGGTCAGAAAGTCGTC 
AAAGGAAATCCTAATCCGAGGAGTTACTTCAAGTGCACAAACATCGAGTGCAGAGTGAAA 
AAACATGTGGAGAGAGGAGCAGACAATATCAAGTTGGTTGTGACTACATACGATGGGATA 
(^CAACCATCCTTCACCACCTGCACGTAGAAGCAATTCCAGTTCAAGGAACCGGTCTGCA 
GGGGCAACAATACCTCAAAATCAGAATGATCGAACCAGTCGGTTAGGTAGGGCTCCTCCT 
ACTCCTACTCCTCCTACTCCTCCTCCTTCGTCTTACACACCTGAGGAGATGAGGCCTTTC 
TCTTCGTTGGCTACAGAAATTGATCTGACAGAGGTTTATATGACCGGAATCTCTATGCTG 
CCGAATATACCGGTTTACGAGAATTCGGGTTTTATGTACCAGAATGATGAACCGACGATG 
AATGCGATGCCGGATGGTTCAGATGTGTACGATGGGATCATGGAACGCCTGTATTTTAAG 
TTTGGTGTCGACATGTAG 

>G183 Amino Acid Sequence (domain in AA coordinates: TBD)
MSDFDENFIEMTSYWAPPSSPSPRTILAMLEQTDNGLNPISEIFPQESLPRDHTDQSGQR 
SGLRERLAARVGFNLPTLNTEENMSPLDAFFRSSNVPNSPVVAISPGFSPSALLHTPNMV
SDSSQIIPPSSATNYGPLEIWETSGEDNAAMMMFNNDLPYQPY 

SFYIPSYEPHVDPIGTPLVTSFESELVDDAHTDIISIEDSESEDGNKDDDDEDFQYEDED

EDQYDQDQDVDEDEEEEKDEDNVALDDPQPPPPKRRRYEVSNMIGATRTSKTQRIILQME 

SDEDNPNDGYRWRKYGQKVVKGNPNPRSYFKCTNIECRVKKHVERGADNIKLVVTTro 

HNHPSPPARRSNSSSRNRSAGATIPQNQNDRTSRLGRAPPTPTPPTPPPSSYTPEEMRPF 

SSLATEIDLTEVYMTGISMLPNIPVYENSGFMYQNDEPTMNAMPDGSDVYDGIMERLYFK 

FGVDM* 
>G2555 (177..956)

CTGTTTTTGTATCCGTGTAAATTAATCACACGGTAGTTTTTGATC 

GAGAACAATCTGGTCTGCTGCTAAAATTTAATAAATTGTTTTGTCTAATTGTCTCCACCC 
ATAAAAAAGCGCGAATTCAATTCACCGACTAAAGACATTCTCCGGTGGAGACCCCGATGC 
AATCCACTCATATAAGCGGCGGAAGTAGCGGTGGTGGTGGTGGAGGAGGAGGAGAGGTGA 
GTCGAAGTGGATTATCTCGGATCCGTTCAGCTCCAGCTACT^GGATTGAAACCCTACTCG 
AAGAAGATGAAGAAGAAGGTTTAAAACCTAACCTTTGTTTAACAGAGCTGCTTACTGGTA 
ATAATAACTCTGGAGGAGTGATAACGAGTCGTGACGACTCGTTCGAGTTCCTGAGTTCTG 
TTGAGCAAGGATTGTATAATCATCATCAAGGTGGTGGCTTTCACCGTCAGAATAGTTCTC 
CGGCTGATTTTCTTAGTGGGTCTGGTTCTGGGACTGATGGGTATTTCTCTAATTTTGGTA 
TTCCGGCGAATTATGACTATTTGTCGACCAACGTTGATATTTCTCCGACTAAACGGTCTA 
GAGATATGGAAACACAGTTTTCTTCTCAGCTGAAAGAAGAGCAAATGAGTGGTGGGATAT 
CAGGAATGATGGATATGAACATGGACAAGATTTTTGAGGATTCAGTTCCTTGTAGGGTTC 
GTGCTAAACGTGGTTGTGCTACTCATCCTCGTAGCATTGCTGAACGGGTGAGAAGAACGC 
GAATAAGTGATCGGATTAGGAGGCTGCAAGAGCTTGTTCCTAACATGGATAAGCAAACCA
ACACTGCAGACATGTTGGAAGAAGCTGTGGAGTATGTGAAGGCTCTTCAAAGCCAGATCC 
AGGAATTGACAGAGCAGCAGAAGAGATGCAAATGCAAACCTAAAGAAGAACAATAATGTA
TCCTTTAGGATTTGATATATCTGTATTTTATTTTTGTACTATCTAAAAATGGTGATGATC 
TGTTCGAAAATTCGAAACATGATCTTATATATTGAACTAGAAAAAATAGATATATATGAA 
TTTTAGCTGTAAAATTTTTGTACAATAAGGAGAAAAAGATTTAGAAGAGTCAATAAAAAG 
ATGATGTTTACAAGTCAAAAAAAAAAA 

>G2555 Amino Acid Sequence (domain in AA coordinates: 175-245) 
MQSTHISGGSSGGGGGGGGEVSRSGLSRIRSAPATWIETLLEEDEEEGLKPNLCLTELLT 
GNNNSGGVITSRDDSFEFLSSVEQGLYNHHQGGGFHRQNSSPADFLSGSGSGTDGYFSNF
GIPAimmiSTNVDISPTKRSRDMETQFSSQLK^^ 

VRAKRGCATHPRSIAERVRRTRISDRIRRLQELVPNMDKQTNTADMLEEAVEYVKALQSQ
IQKLTEQQKRCKCKPKEEQ*
>G375 (53..1171)

TCGACAAAAACTCTCACTCTCCCTCAAACTAAACAAACATACAGAACACAAT^ATGGGTCT 
CACTTCTCTTCAAGTTTGCATGGATTCTGATTGGCTCCAGGAATCCGAGTCATCAGGAGG 
AAGCATGTTAGACTCTTCAACGAATTCTCCGTGAGCAGCCGACATACTAGC^GCTTGCAG 
CACTAGACCACAAGCCTCGGCCGTGGCTGTAGCCGCTGCAGCTCTGATGGACGGTGGAAG 
GAGGCTGCGTCCACCTCACGACCATCCTCAAAAGTGTCCTCGTTGCGAGTCAACACATAC 
TAAGTTCTGTTACTACAATAACTACAGCCTCTCTCAGCCTCGTTACTTCTGCAAGACTTG 
TCGCCGTTACTGGACAAAAGGCGGAACTCTAAGGAATATTCCGGTTGGTGGTGGATGCCG 
TAAAAACAAGAAACCATCTTCCTCTAATTCCTCCTCCTCC^CTTCTTCCGGCAAAAAACC 
ATCCAA^TCGTTACCGCC^TACCTCTGATCTTATGGCTTTAGCACATTCTCATCAAAA 
TTACCAACATTCTCCTCTAGGGTTTTCACATTTTGGTGGGATGATGGGGTCTTACTCAAC 
TCCGGAGCATGGTAACGTTGGTTTCTTGGAGAGCAAGTATGGCGGTTTGCTTTCGCAGAG 
CCCTAGACCTATTGATTTCTTGGACAGTAAGTTTGATCTCATGGGAGTGAACAATGACAA 
CCTGGTCATGGTTAATCATGGAAGTAACGGAGATCATC^TCATCATCATAATCATCAC^T 
GGGTCTGAATCACGGTGTAGGTCTTAACAACAACAACAACAATGGTGGATTTAATGGGAT 
TTCTACGGGAGGCAATGGAAATGGTGGTGGTCTCATGGATATATCGACATGCCAAAGACT 
TATGCTATCTAATTATGATCATCACCATTACAATCATCAAGAAGATCATCAAAGGGTAGC 
AACAATAATGGATGTGAAGCCAAATCCGAAGTTGTTATCGCTTGATTGGCAGCAAGATCA 
ATGCTACTCCAATGGTGGTGGTAGCGGAGGCGCAGGAAAATCCGACGGTGGTGGATACGG 
CAATGGTGGTTATATCAACGGTTTAGGTTCGTCGTGGAATGGTTTGATGAATGGCTATGG 
AACGTCCACTAAAA€AAACTCCTTGGTTTGATAAGTTAATCAGAACTTCTTTTTTCTTGT 
CGTCATCAACTAGTAGTAGTAGTAATAGTAGTTGGAGACTAGAGAAGCACTTCAAATTAT 
TTATGGGTTTGTTTGCTAAGCCAGTTTTAC 

>G375 Amino Acid Sequence (domain in AA coordinates: 75-103) 
MGLTSLQVCMDSDWLQESESSGGSMLDSSTNSPSAADILAACSTRPQASAVAVAAAALMD
GGRRLRPPHDHPQKCPRCESTHTKFCYYNNYSLSQPRYFCKTCRRYWTKGGTLRNIPVGG
GCRKNKKPSSSNSSSSTSSGKKPSNIWANTSDLMALAHSHQNYQHSPLGFSHFGGMMGS 
YSTPEHGIWJFLESKYGGLLSQSPRPIDFLDSKFDLMGVNND^ 

HHMGLNHGVGLNNNNNNGGFNGISTGGNGNGGGLMDISTCQRLMLSNYDHHHYNHQEDHQ
RVATIMDVKPNPKLLSLDWQQDQCYSNGGGSGGAGKSDGGGYGNGGYINGLGSSWNGLMN 
GYGTSTKTNSLV*

>G1007 (86..763)

ATTCCTTCTTGCCTAGGAACTAATTGTTGCACACT^ 

CGACATCAAAACGAGAGAGAAAAGAATGGTGGATTCTC^TGGCTCCGACACGGAATGTTC 
CTCCAAGAAGAAAAAGGAGAAAACGAAAGAAAAGGGGGTATATCGTGGGGCTCGCATGAG 
GAGCT<3GGGGAAATGGGTCTCGGAGATTCGGGAGCCCCGTAAGAAATCAAGAATCTGGCT 
CGGGACTTTCCCCACGGCGGAGATGGCAGCGCGTGCCCATGATGTTGCGGCATTGAGTAT 
CAAAGGAAGTTCCGCAATCCTTAACTTCCCTGAGCTCGCGGATTTTCTGCCAAGACCAGT 
CTCGCTCAGCCAAC^GGATATCCAGGCCGCAGCCGCCGAAGCCGCTCTTATGGATTTCAA 
AACTGTACCATTCCATCTTCAGGATGACTCAACGCCGTTGCA7UVCTAGGTGTGATACTGA 
GAAGATCGAAAAGTGGTCATCCTCATCGTCCTCAGCCTCATCCTCATCCTCATCTTCGTC 
CTCGTCCTCATCATCTATGCTTTCGGGGGAGCTAGGAGATATTGTGGAGTTGCCGAGTCT 
TGAAAACAATGTAAAATACGATTGTGCGCTGTATGACTCGTTGGAGGGGCTGGTGTCGAT 
GCCCCCATGGTTAGATGCTACCGAAAATGATTTTAGGTATGGAGATGATTCGGTACTGTT 
GGACCCATGTCTCAAAGAAAGCTTTTTGTGGAATTATGAGTAAGGTTTTTTTTTGGAAAG 
AAATGTGGTTTTTTGTTTCCTCCTCTCTTTTATACT^ 

ATATCTTCTACATATGTAATACTTTTCGATTAGTAAACAATGATTCGGTTTCGGGTAGAA 
AAAAAAAAAAAAAAAAAAAAAAA 

>G1007 Amino Acid Sequence (domain in AA coordinates: TBD) 
MVDSHGSDTECSSKKKKEKTKEKGVYRGARMRSW 

AARAHDVAALSIKGSSAJLNFPEI^FLPRPVSLSQQDIQAAAAEAALMDFKTVPFHLQD 
DSTPLQTRCI^KIEKCTSSSSSSASSSSSSSSSSSSSMLSGELG^^ 
ALYDSLEGLVSMPPWLDATENDFRYGDDSVLLDPCLKESFLWNYE*
>G1010 (344..1276)



AAAAGAGAGAGAGCTATGTAGCTATGAAACAGTAAGAGATATAGATATAGAGAGACAGAG 

AAAGATGATGATCAGTGAAGTTAGGCTAAACCCACTTTCTATTTATGTATAATTAGGTCA 

ATC^CATCACCAATCTCCTCCTCCAATTCTCCTCCTCTCCTTCCAAATTCTAGGGT^ 

CTTGTATCTCACCCCCTTTCTCAATTCCCTAGGGAAACTGTGAATTTCATCAAATTCCAT 

TATTTTTTGGTCACACCCTTAAAGAGATCTGAGAGTTCTAAAGATGATGACAGATTTATC 

TCTCACGAGAGATGAAGATGAAGAAGAAGCAAAGCCCTTAGCAGAAGAAGAAGGAGCGCG 

TGAAGTAGCAGACAGAGAGCACATGTTCGACAAAGTTGTGACTCCAAGTGATGTCGGAAA 

ACTAAACCGACTTGTGATCCCAAAGCAACACGCAGAGAGATTCTTCCCTTTAGATTCATC 

TTCAAACGAGAAAGGTTTGCTTTTAAACTTCGAAGATCTCACTGGCAAATCTTGGAGGTT 

CCGTTACTCTTACTGGAACAGTAGTCAAAGCTATGTCATGACTAAAGGTTGGAGC^GATT 

CGTTAAAGACAAAAAGCTTGACGCCGGAGATATTGTCTCTTTCCAAAGATGTGTCGGAGA 

TTCAGGAAGAGATAGCCGTTTGTTTATTGATTGGAGGAGAAGACCTAAAGTCCCTGACCA 

TCCTCATTTCGCCGCCGGAGCTATGTTCCCTAGGTTTTACAGCTTTCCTTCGACCAATTA 

CAGTCTTTATAATCATCAGCAGCAACGTCATCATCACAGTGGTGGTGGTTATAATTATCA 

TCAAATTCCGAGAGAATTTGGTTATGGTTACTTCGTTAGGTCAGTGGATCAGAGGAACAA 

TCCTGCGGCTGCGGTGGCTGATCCGTTGGTGATTGAATCTGTGCCGGTGATGATGCACGG 

GAGAGCTAATCAGGAACTTGTTGGAACGGCCGGGAAGAGACTGAGGCTTTTTGGAGTTGA 

TATGGAATGCGGCGAGAGCGGAATGACCAACAGTACGGAGGAGGAATCATCATCTTCCGG 

TGGAAGTTTGCCACGTGGAGGCGGTGGTGGTGCTTGATCTTCCTCTTTCTTTCAGCTGAG 

ACTTGGAAGCAGCAGTGAAGATGATCACTTCACTAAGAAAGGAAAGTCTTCATTGTCTTT 

TGATTTGGATCAATAATAATGATGATGATGAAATTAGTTGGTATTTTAAGAAAAAAAACA 

TACATATATAATTCTATATATATGACAACATAATGCATTGATTTCCTT 

>G1010 Amino Acid Sequence (domain in AA coordinates: 33-122) 

MMTDLSLTRDEDEEEAKPLAEEEGAREVADREHM^ 

FPLDSSSNEKGLLLNFEDLTGKSWRFRYSYmSSQSYVMTKGWSRFVKDKKLDAGDIVSF 
QRC^GDSGRDSRIiFIDVTORRPKVPDHPHFAAGAMFPRFYSFPST^ 
GGYNYHQIPREFGYGYFVRSVDQRNNPAAAVADPLVIESVPVMMHGRANQELVGTAGKRL 
RLFGVDMECGESGMTNSTEEESSSSGGSLPRGGGGGASSSSFFQLRLGSSSEDDHFTKKG 

KSSLSFDLDQ* 
>G1014 (174..1112)

CACAAACCACAGTCTCTCTTTCTCTCTCTATCTATCTTCTCTTTCTCTCTCTATCTCTAT 
CACTGAAACCCAAAGAGATCCACCATTTGTTCTTTTTTCCTTCACACAGAGAACTGTTTT 
CTTCCACACTTCCTTTTTACTAGG

ATGAAAATGTGGAAACCAAGGCCTCTACTTTAGTGGCAAGTC 

CCGGGTCGGGTCATGATCATCATGGGTTATCGGCGTCTGTGCCTCTTCTTGGTGTTAACT 

GGAAGAAGAGAAGGATGCCTAGACAGAGACGATCTTCTTCITCCTTTAACCTTCTCT 

TCCCTCCTCCTATGCCTCCTATTTCCC^CGTGCCAACTCCTCTCCCCGCACGTAAAATTC 

ACCCAAGAAAGCTAAGATTCCTCTTCCAAAAGGAACTCAAGAACAGTGACGTCAGCTCTC 

TCCGACGTATGATACTCCCGAAGAAAGCCGCGGAGGCTCACTTGCCGGCACTTGAATGCA 

AGGAAGGGATTCCTATAAGAATGGAAGATTTGGACGGTTTTCACGTTTGGACCTTCAAGT 

ATAGGTACTGGCCAAACAACAATAGCAGAATGTACGTGCTAGAAAACACAGGCGATTTTG 

TGAATGCTCATGGTCTGCAGCTAGGTGACTTCATCATGGTTTACCAAGATCTCTACTCAA 

ACAATTACGTTATACAAGCAAGAAAAGCATCGGAAGAAGAAGAAGTAGACGTAATCAATC 

TTGAAGAAGACGACGTTTACACAAACTTAACAAGGATCGAAAACACTGTGGTTAACGATC 

TTCTCCTCCAAGATTTTAATCATGACAACAACAACAACAACAACAAC^ 

GCAACAAATGTTCTTACTATTATCGAGTCATAGATGATGTCACC^CAAACACAGAGTCTT 

TTGTCTACGACACGACGGCTCTTACCTCCAACGATACTCCTCTCGATTTTTTGGGTGGAC 

ATACGACGACTACTAATAATTATTACTCCAAGTTCGGAACATTCGATGGTTTGGGCTCCG 

TTGAGAATATCTCTCTCGATGACTTCTACTAGATAATCAATCGATGGGCTCATGGTATTC 

TTGATGGTGATCAGCTATTTAATATCCTTATAATATATATAAGAATTAAATGCAATTTGC 

ATATATATTATCAAGTGTTGTAATATAACATTACAGTTTAAAAAAAAAAAAAAAAAA 

>G1014 Amino Acid Sequence (domain in AA coordinates: 90-172) 

MVDENVETKASTLVAS VDHGFGSGSGHDHHGLSAS VPLLGVNWKKRRMP S S S FNL 

LSPPPPMPPISHVPTPLPARKIDPRKLRFLFQKELKNSDVSSLRRMILPKKAAEAHLPAL 

ECKEGIPIRMBDLDGFHVWTFKYRYWPNNNSRMYV^ 

YSNNYVIQARKASEEEEVDVINLEEDDVYTNLTRIENT^^ 

SNSNKCSYYYPVIDDVTTNTESFVYDTTALTSOT^ 

GSVENISLDDFY* 

>G1035 (103..624)

CCATAATAATATATTAAAACTATATACTATAATCTTTTTACATAATAAACTTTGGGTCCT 
GCGTCTTAATCATAGTACTTAATTTTCTCTGTGTGTTTTAATATGAATAATAAAACTGAA 
ATGGGATCTTCCACAAGTGGAAATTGCTCGTTC 

GGTTCAGAATCTGATCTCCGGCAACGTGATCTAATCGACGAGCGGAAGAGAAAGAGGAAA 
CAGTCGAACAGAGAATCTGCGAGGAGGTCGAGGATGAGGAAGCAGAAGCATTTGGATGAT 
CTCACTGCTCAGGTGACTCATCTACGTAAAGAAAACGCTCAGATCGTCGCCGGAATCGCC 
GTCACGACGCAGCACTACGTCACTATCGAGGCGGAGAACGACATTCTCAGAGCTCAGGTT 
CTTGAACTTAACCACCGTCTCCT^TCTCTTAACGAGATCGTTGATTTCGTCGAATCTTCT 
TCTTCAGGATTCGGTATGGAGACCGGTCAGGGATTATTCGACGGTGGATTATTCGACGGC 
GTGATGAATCCTATGAATCTAGGGTTTTATAATCAACCAATCATGGCTTCTGCTTCTACT 
GCTGGTGATGTTTTCAACTGTTAGAAAACTTCACATCATTATCATCGTGAGTGAGACTAA 
TCATCGCAGCAGGGGTAAAACTGTAATTTTTCTTATAAATTATGTGATGATGCTTTGTTT 
CTTTATTTTATAAGATGGTTAATTAGTGTTTAAAACrGATTGTAATGATAGAC^GTGTAA 
GAAATGTGTGATATCATGGAGATGGTGATGTGAGTTTGGTACAAATATTTTAAGATCTTT 
TCTTTCTATATATTAAAAGTGAAGA^TAATATTTTGTCATTTTCTTAAAAAAAAAAAAA 
AAA 

>G1035 Amino Acid Sequence (domain in AA coordinates: 39-91) 

MNNKTEMGSSTSGNCSSVSTTGIIANS

QKHLDDLTAQVTHLRKENAQIVAGIAVTTQH^

DFVESSSSGFGMETGQGLFDGGLFDGVMNPMNLGFYNQPIMASASTAGDVFNC*

>G1046 (1..567)

ATGATTAGACATCTAAAACCCTACATGGAGTCGTCTAGTGTCCATCGCTCTCATTGTTTC 
GATATTCTTGATGGAGTCCCACTACACGACGATCATTTCAACTCGGCATTCCTACCAAAC 
ACTGACTTTAATGTTCATTTGCAGTCAAACGTATCGACCCGCATCAACAATCAGTCTCAC 
TTAGACCCAAATGCAGAAAACATTTTCC^TAAC 

GCAAGAAGAATGGTCTCTAACCGGGAATCTGCAAGGAGGTCACGTATGCGCAAAAAGAAG 
CAGATCGAAGAGCTGCAACAACAAGTTGAAC7UVCTCATGATGTTGAATCATCACTTGTCT 
GAGAAAGTCATCAACTTGTTGGAAAGCAACCATCAGATCCTACAAGAGAACTCACAGCTG 
AAAGAGAAAGTCTCTTCCTTTCACTTGCTCATGGCAGATGTGCTATTACCCATGAGAAAT 
GCAGAGAGCAACATCAATGACCGCAATGTGAATTATCTAAGAGGAGAACCATCAAACCGT 
CCCACCAACAGTCCCTTTGGTAAGTAA

>G1046 Amino Acid Sequence (conserved domain in AA coordinates: 79-138)
MIRHLKPYMESSSVHRSHCFDILDGVPLHDDHFNSAFLPNTD 

LDPNAENIFHNEGLAPEERRARRMVSNRESARRSRMRKKKQIEELQQQVEQLMMLNHHLS

EKVINLLESNHQILQENSQLKEKVSSFHLL^ 

PTNSPFGK* 

>G1049 (29.. 550) 

CTAACTTTCTTCCCAAGTAAACTTCAA 

TAACTACCTAAACTCATCGATACTGCAGTCTCCGTATCCTTCTAATTTCCCGATATCTAC 

GCCATTTCCAACCAACGGTCAAAACCCGTACCTCCTCTACGGATTCCAAAGCCCTACAAA 

CAATCCACAATCCATGAGCCTAAGCAGCAACAACTCAACATCAGATGAAGCAGAAGAGCA 

GC^GACGAACAACAATATAATCAACGAGCGGAAGCAGAGAAGGATGATTTCAAACCGAGA 

ATCCGCAAGGAGATCGCGTATGAGGAAGCAAAGACACCTTGACGAGC 

GATGTGGTTAAGGATCGAGAATCATCAGTTGCTTGATAAGCTTAACAATCTCTCTGAGTC 

TCACGACAAGGTTCTTCAAGAGAATGCTCAGCTTAAAGAAGAAACATTTGAGCTTAAGCA 

AGTGATCAGCGATATGCAAATTCAAAGCCCTTTCTCTTGCTTTAGAGACGATATAATCCC 

CATTGAATAAAGCATTTTTCCCCGATTCATATTTATGAAAATTTTCOT 

TCTTTGTATGTATATGTGGAGATGTATTTCAGGGTTTTGATAATATGACCCTTTACGACG 

ACGTTTTTAGATTGTAGTAAATTTATAAACTAAAGAAGATTAGTGTTAATGAAGAACAAA 

TATAA 

>G1049 Amino Acid Sequence (domain in AA coordinates 77-132) 
MQPQTDVFSLHNYLNSSILQSPYPSNFPISTPFPTNGQNPYLLYGFQSPTNNPQSMSLSS 

NNSTSDEAEEQQTNNNIINERKQRRMISNRE^ 

LLDKLNNLSESHDKVLQENAQLKEETFELKQVISDMQIQSPFSCFRDDIIPIE* 
>G1069 (89.. 934) 

TTGGAACCCTAGAGGCCTTTCAAGCAAATCATCAGGGTAACAATTTCTTGATCTTTCTTT 
TTAGCGAATTTCCAGTTTTTGGTCAATCATGGCAAACCCTTGGTGGACGAACCAGAGTGG 
TTTAGCGGGCATGGTGGACraTTCGGTCTCCT 

AAGTCTTCTTACCAAAGGAGATCTTGGAATAGCCATGAATCAGAGCCAAGACAACGACCA
AGACGAAGAAGATGATCCTAGAGAAGGAGCCGTTGAGGTGGTCAACCGTAGACCAAGAGG 
TAGACraCCAGGATCCAAAAAO^CCO^GCTC 

CAACGCACTCCGTAGCCATGTCTTGGAGATCTCCGACGGCAGTGACGTCGCCGACACAAT 
CGCTCACnTCTC^GACGCAGGCAACGCGGCGTTTGCGTTCTC^GCGGGACAGGCTaVGT 
CGCTAACGTCACCCTCCGCCAAGCCGCCGCACCAGGAGGTGTGGTCTCTCTCCAAGGCAG 
GTTTGAAATCTTATCTTTAACCGGTGCTTTCCTCCCTGGACCTTCCCCACCCGGGTCAAC 
CGGTTTAACGGTTTACTTAGCCGGGGTCCAGGGTCAGGTCGTTGGAGGTAGCGTTGTAGG 
CCCACTCTTAGCCATAGGGTCGGTC^TGGTGATTGCIX3CTACrTTCTCTAACGCTACTTA 
TGAGAGATTGCCCATGGAAGAAGAGGAAGACGGTGGCGGCTCAAGACAGATTCACGGAGG 
CGGTGACTCACCGCCCAGAATCGGTAGTAACCTGCCTGATCTATCAGGGATGGCCGGGCC 
AGGCTACAATATGCCGCCGCATCTGATTCCAAATGGGGCTGGTCAGCTAGGGCACGAACC
ATATACATGGGTCCACGCAAGACCACCTTACTGACTCAGTGAGCC^TTTCTATATATAAT 
GGTCTATATAAATAAATATATAGATGAATATAAGCAAGCAATTTGAGGTAGTCTATTACA 
AAGCTTTTGCTCTGGTTGGAAAAATAAATAAGTATCAAAGCTTTGTTTGTTCTTAATGGA 
AATATAGAGCTTGGGAAGGTAGAAAGAGACGACATT 

>G1069 Amino Acid Sequence (domain in AA coordinates: 67-74) 

MANPWWTNQSGLAGMVDHSVSSGHHQNHHHQSLLTKGDLGIAMNQSQDNDQDEEDDPREG 

AVEVVNRRPRGRPPGSKNKPKAPIFVTRDSPNALRSHVLEISDGSDVADTI 

GVCVLSGTGSVANVTLRQAAAPGGVVSLQGRFEILSLTGAFLPGPSPPGSTGLTVYLAGV

QGQVVGGSVVGPLLAIGSVMVIAATFSNATYERLPMEEEEDGGGSRQIHGGGDSPPRIGS

NLPDLSGMAGPGYNMPPHLIPNGAGQLGHEPYTWVHARPPY*

>G1070 (170.. 1144) 

TCGACCAGCTTGGATTTCGTTGTTCATCATTACTACTCTCTTTCTTCTTCTAGCTAGCTA 
GTTTTGACAGCAAAATAAGAAGC^VAAAAAAAGGTCAACTAAAAAAGATCTGTTCTTAGAT 
CACTCTCTTCTTCTTTTTTTGATCCAATTCCACCATTGAATCATAGATCATGGATCCAGT 
ACAATCTCATGGATCACAAAGCTCTCTACCTC 

ACATCTTCAACAACAGCAACAAGAGTTCTTCCTCCACCATCACCAGCAACAAAGAAACCA 
AACCGATGGTGACCAACAAGGAGGATCAGGAGGAAACCGACAAATCAAGATGGATCGTGA 
AGAGACAAGCGACAACATAGACAACATAGCTAACAACAGCGGTAGTGAAGGTAAAGACAT

AGATATACACGGTGGTTCAGGAGAAGGAGGTGGTGGCTCCGGAGGAGATCATCAGATGAC 

AAGAAGACCAAGAGGAAGACCAGCGGGATCCAAGAACAAACCAAAACCACCGATTATCAT 

CACACGGGACAGCGCAAACGCGCTTAGAACCCACGTGATGGAGATCGGAGATGGCTGCGA 

CTTAGTCGAAAGCGTTGCCACTTTTGCACGAAGACGCCAACGCGGCGTTTGCGTTATGAG 

CGGTACTGGAAATGTTACTAACGTCACTATACGTCAGCCTGGATCTCATCCTTCTCCTGG 

CTCGGTAGTTAGTCTTCACGGAAGGTTCGAGATTCTATCTCTCTCAGGATCTTTTCTCCC 

TCCTCCGGCTCCTCCTACAGCCACCGGATTGAGTGTTTACCTCGCTGGAGGACAAGGACA 

GGTGGTTGGAGGAAGCGTAGTTGGTCCGTTGTTATGTGCTGGTCCTGTCGTTGTCATGGC 

TGCGTCTTTTAGCAATGCGGCGTACGAAAGGTTGCCTTTAGAGGAAGATGAGATGCAGAC 

GCCGGTTCATGGCGGAGGAGGAGGAGGAT(^TTGGAGTCGCCGCCAATGATGGGACAACA 

ACTGCAACATCAGCAACAAGCTATGTCAGGTCATCAAGGGTTACCACCTAATCTTCTTGG 

TTCGGTTCAGTTGCAGCAGCAACATGATCAGTCTTATTGGTCAACGGGACGACCACCGTA 

TTGATCAAATATACACACACACTCATAATCGTTGCTAGCTAGCTAACGATGAATCATGAG 

TTTAGTGGATATATATATGATTAAAAGAGGTTAGCTTATGAA

TTCTATCGAGCTTCATTATGTTTGGGTCATCGTTC 

>G1070 Amino Acid Sequence (domain in AA coordinates: 98-120) 

MDPVQSHGSQSSLPPPFHAraFQLHLQQQQQBFFLHHHQQQRNQTDGDQQGGSGGNRQIK 

MDRKETSDNIDNIANNSGSEGKDIDIHGGSGEGGGGSGGDHQMTRRPRGRPAGSKNKPKP 

PIIITRDSANALRTHVMEIGDGCDLVESVATFARRRQRGVCVMSGTGNVTNVTIRQPGSH

PSPGSVVSLHGRFEILSLSGSFLPPPAPPTATGLSVYLAGGQGQVVGGSVVGPLLCAGPV

VVMAASFSNAAYERLPLEEDEMQTPVHGGGGGGSLESPPMMGQQLQHQQQAMSGHQGLPP 

NLLGSVQLQQQHDQSYWSTGRPPY* 
>G1076 (198.. 1076) 

ATTTTAGTCTTCCTATAACTTCTTCTCAATCCTCTCTCA 

TTTCAATAAAATAGAAAAAAACATATACAAATCTAC^ 

CTTGTGTGTGTGTGTGTGTTTTATAT^^ 

TTGCTTTTGATGTGGGCATGGCTGGTCTTGATCTAGGCACAGCTTTTCGTTACGTTAATC 

ACCAGCTCCATCGTCCCGATCTCCACCTTCACCACAATTCCTCCTCCGATGACGTCACTC 

CCGGAGCCGGGATGGGTCATTTCACCGTCGACGACGAAGACAACAACAACAACCATCAAG

GTCTTGACTTAGCCTCTGGTGGAGGATCAGGAAGCTCTGGAGGAGGAGGAGGTCACGGCG 

GGGGAGGAGACGTCGTTGGTCGTCGTCCACGTGGCAGACCACCGGGATCCAAGAACAAAC 

CGAAACCTCCGGTAATTATCACGCGCGAGAGCGCAAACACTCTAAGAGCTCACATTCTTG 

AAGTAAC^^CGGCTGCGATGTTTTCGACTGCGTTGCGACTTATGCTCGTCGGAGACAGC 

GAGGGATCTGCGTTCTGAGCGGTAGCGGAACGGTCACGAACGTCAGCATACGTCAGCCAT 

CTGCGGCTGGAGCGGTTGTGACGCTACAAGGAACGTTCGAGATTCTTTCTCTCTCCGGAT 

CGTTTCTTCCTCCTCCGGCACCTCCCGGAGCAACGAGTTTGACAATTTTCTTAGCCGGA 

GACAAGGTCAGGTGGTTGGAGGAAGCGTTGTGGGTGAGCTTACGGCGGCTGGACCGGTGA 

TTGTGATTGCAGCTTCGTTTACTAATGTTGCTTATGAGAGACTTCCTTTAGAAGAAGATG 

AGCAGCAGCAACAGCTTGGAGGAGGATCTAACGGCGGAGGTAATTTGTTTCCGGAGGTGG 

CAGCTGGAGGAGGAGGAGGACTTCCGTTCTTTAATTTACCGATGAATATGCAACCAAATG 

TGCAACTTCCGGTGGAAGGTTGGCCGGGGAATTCCGGTGGAAGAGGTCCTTTCTGATGTG 

TATATATTGATAATCATTATATATATACCGGCGGAGAAGCTTTTCCGGCGAAGAATTTGC 

GAGAGTGAAGAAAGGTTAGAAAAGCTTTTAATGGACTAATGAATTTCAAATTATCATCGT 

GATTTCGGACATTGTCTTGTTCATCATGTTAAGCTTAGGTTTATTTTTTGTCGTTTGTAG 

AATTTTATGTTTGAATCCTTTTTTTTTTCTGTGAAACTCTATTGTGTTCGTCTGCGAAGG 

AAAAAAAAATTCTCAAAAAAAA 

>G1076 Amino Acid Sequence (domain in AA coordinates: 82-89)

MAGLDLGTAFRYVNHQLHRPDLHLHHNSSSDDVTPGAGMGHFTVDDEDNNNNHQGLDLAS

GGGSGSSGGGGGHGGGGDVVGRRPRGRPPGSKNKPKPPVIITRESANTLRAHILEVTNGC

DVFDCVATYARRRQRGICVLSGSGTVTNVSIRQPSAAGAVVTLQGTFEILSLSGSFLPPP

APPGATSLTIFLAGGQGQVVGGSVVGELTAAGPVIVIAASFTNVAYERLPLEEDEQQQQL

GGGSNGGGNLFPEVAAGGGGGLPFFNLPMNMQPNVQLPVEGWPGNSGGRGPF* 

>G1089 (31.. 2427) 

AAGTAAGAGAGCTTCTTAAGGAAGAAGAAGATGGGTTGTGCTCAATCAAAGATCGAGAAC 
GAAGAAGCAGTTACTCGTTGCAAAGAACGAAAACAATTGATGAAAGACGCCGTCACTGCT 
CGTAACGCTTTCGCCGCCGCTCACTCAGCTTACGCTATGGCTCTTAAAAACACCGGAGCT 
GCTCTTTCCGATTACTCTCACGGCGAGTTTTTAGTCTCTAATCACrCG

GCTGCAGCAATCGCTTCTACTTCTTCTCTTC^

TCCACCGCTCCGGTTTCTAATTCAACCGCTTCTTCTTCCTCCGCTGCGGTTCCTCAGCCG 

ATTCCTGATACTCTTCCTCCTCCTCCTCCTCCACCZACCGCTTCCrCTTCAACGTGCTGCT 

ACTATGCCGGAGATGAACGGTAGATCCGGTGGTGGTCATGCTGGTAGTGGACTCAACGGA 

ATTGAAGAAGATGGAGCCCTAGATAACGATGATGATGACGATGATGATGATGATGACTCT 

GAAATGGAGAATCGTGATCGTTTGATTAGGAAATCGAGAAGCCGTGGAGGTAGTACTAGA 

GGAAATAGGACGACGATTGAAGATCATCATCTTCAGGAGGAGAAAGCTCCGCCACCTCCC 

CCTTTGGCGAATTCGCGGCCAATTCCGCCGCCACGTCAGCATCAGCATCAACATCAGCAA 

CAGGAACAACAACCTTTCTACGATTACTTCTTCCCTAATGTTGAGAATATGCCTGGAACT 

ACTTTAGAAGATACTCCTCCACAACCACAACCACAACCAACAAGGCCTGTGCCTCCTCAA 

CCACATTCACCAGTCGTTACTGAGGATGACGAAGATGAGGAGGAGGAAGAGGAGGAAGAG 

GAGGAGGAAGAGGAGACGGTGATTGAACGGAAACCACTGGTGGAGGAAAGACCGAAGAGA 

GTAGAGGAAGTGACGATTGAATTGGAAAAAGTTACTAATTTGAGAGGGATGAAGAAGAGT 

AAAGGGATAGGGATTCCCGGAGAGAGGAGAGGAATGCGAATGCCGGTGACTGCGACGCAT 

TTGGCGAATGTATTCATTGAGCTTGATGATAATTTCTTGAAAGCTTCTGAAAGTGCTCAT 

GATGTTTCTAAGATGCTTGAAGCTACTAGGCTCCATTACCATTCTAATTTTGCAGATAAC 

CGAGGACATATTGATCACTCTGCTAGAGTGATGCGTGTAATTACATGGAATAGATCATTT 

AGAGGAATACCAAATGCTGATGATGGGAAAGATGATGTTGATTTGGAAGAGAATGAAACT 

CATGCTACTGTTCTTGACAAATTGCTAGCATGGGAAAAGAAGCTCTATGACGAAGTCAAG

GCTGGCGAACTCATGAAAATCGAGTACCAGAAAAAGGTTGCTCATTTAAATCGGGTGAAG 

AAACGAGGTGGCC^CTCGGATTCATTAGAGAGAGCTAAAGCTIGCAGTAAGTCATTTGCAT 

ACAAGATATATAGTTGATATGCAATCCATGGACTCCACAGTTTCAGAAATCAATCGTCTT 

AGGGATGAACAACTATACCTAAAGCTCGTTCACCTTGTTGAGGCGATGGGGAAGATGTGG 

GAAATGATGCAAATACATCATCAAAGACAAGCTGAGATCTCAAAGGTGTTGAGATCTCTA 

GATGTTTCACAAGCGGTGAAAGAAACAAATGATCATCATCACGAACGCACCATCCAGCTC 

TTGGCAGTGGTTCAAGAATGGCACACGCAGTTTTGCAGGATGATAGATCATCAGAAAGAA 

TACATAAAAGCACTTGGCGGATGGCTAAAGCTAAATCTCATCCCTATCGAAAGCACACTC 

AAGGAGAAAGTATCTTCGCCTCCTCGAGTTCCCAATCCCGCAATCCAAAAACTCCTCCAC 

GCTTGGTATGACCGTTTAGACAAAATCCCCGACGAAATGGCTAAAAGTGCCATAATCAAT 

TTCGCAGCGGTTGTAAGCACGATAATGCAGCAGCAAGAAGACGAGATAAGTCTCAGAAAC 

AAATGCGAAGAGACAAGAAAAGAATTGGGAAGAAAAATTAGACAGTTTGAGGATTGGTAC 

CACAAATACATCCAGAAGAGAGGACCGGAGGGGATGAATCCGGATGAAGCGGATAACGAT 

CATAATGATGAGGTCGCTGTGAGGCAATTCAATGTAGAACAAATTAAGAAGAGGTTGGAA 

GAAGAAGAAGAAGCTTACCATAGACAAAGCCATCAAGTTAGAGAGAAGTCACTGGCTAGT 

CTTCGAACTCGCCTCCCCGAGCTTTTTCAGGCAATGTCCGAGGTTGCGTATTCATGTTCG 

GATATGTATAGAGCTATAACGTATGCGAGTAAGCGGCAAAGCCAAAGCGAACGGCATCAG 

AAACCTAGCCAGGGACAGAGTTCGTAAGAACTAATGTAAGATCAGAGTAATGTCTTCTTC 

TTCTTTGATCTTGAATATTTAAGCACACACATACATACAACGTATAGCTAAATCTTTATC 

ATTGCTTTCTTATATTAAGGTTTTGGCTTTTGTAAGAAGGTTTCTTACATATGAGATTCA 

TATAGTGTTTGATTCTTAAGGAACTGTTCTGTTGAGTAATAAGAAAGTTGTGTATTGAAA 

TAGAGTTGCATTTGTTAATTTTG 

>G1089 Amino Acid Sequence (domain in AA coordinates 425-500) 
MGCAQSKIENEEAVTRCKERKQLMKDAVTARNAFAAAHSAYAMALKNTGAALSDYSHGEF

LVSNHSSSSAAAAIASTSSLPTAISPPLPSSTAPVSNSTASSSSAAVPQPIPDTLPPPPP 

PPPLPLQRAATMPEI^GRSGGGHAGSGljNGIEEDGALDiroDDDDDDDDDSEMENIUDRLIR 

KSRSRGGSTRGNRTTIEDHHLQEEKAPPPPPLANSRPIPPPRQHQHQHQQQQQQPFYDYF 

FPNVEimPGTTLED^PPQPQPQPTRPVPPQPHSPVVTEDDEDEEEEEEEEEEEEETVIER 

KPLVEERPKRVEEVTlELEKNnTNLRGMKKSKGIGIPGERRGMRMPOT 

NFLKASESAHDVSKMBEATRmYHSNFADNRGHIDHSARVMRVITWNRSFRGIPNADDGK 

DDVDLEENETHATVLDKLLAWEKKLYDEVK^ 

RAKAAVSHLHTRYIVDMQSr^STVSEINRLRDEQLYLKLVHLVEAMGKMW 

AEISKVLRSLDVSQAVKETNDHHHERTIQLLAVVQEWHTQFCRMIDHQKEYIKALGGWLK

LNLIPIESTLKEKVSSPPRVPNPAIQKLLHAWYDRLDKIPDEMAKSAIINFAAVVSTIMQ

QQEDEISLRNKCEETRKELGRKIRQFEDWYHKYIQKRGPEGMNPDEADNDHNDEVAVRQF 

NVEQIKKRLEEEEEAYHRQSHQVREKSLASLRTRLPELFQAMSEVAYSCSDMYRAITYAS 

KRQSQSERHQKPSQGQSS* 
>G1093 (1..531)

ATGGGTTATCCGGTGGGGTACACTGAGCTCCTCCTCCCAAGAATCTTCCTTCACTTACTC

TCTCTCTTAGGCTTAATACGAACACTCATAGACACGGGTTTTCGGATATTGGGTCTACCC 

GACTTTCTCGAATCCGACCCGGTTTCATCGTCATCGTCATGGCTGGAACCACCGTATATG 

TCCACGGCGGCGCATCATCACCAAGAAAGCTC^TTTTTCTTCCCAGTGGCGGCGAGGCTA 

GCTGGAGAAATCTTGCCCGTCATCAGATTCTCGGAGCTAACTCGACCCGGATTCGGATCC 

GGATCCGATTGCTGCGCGGTGTGCCTCCACGAGTTCGAGAACGATGACGAGATCCGACGG 

CTGACGAATTGTCAACACATATTTCACCGGAGCTGTTTAGACCGTTGGATGATGGGTTAT 

AATCAGATGACGTGTCCACTTTGTAGAACGCCGTTTATTTCTGATGAGTTACAAGTTGCT 

TTTAACCAACGAGTTTGGTCTGAATCTGAACTTCTCGCAGAATCAAATTAG 

>G1093 Amino Acid Sequence (domain in AA coordinates: 105-148) 

MGYPVGYTELLLPRIFLHLLSLLGLIRTLIDTGFRILGLPDFLESDPVSSSSSWLEPPYM

STAAHHHQESSFFFPVAARLAGEILPVIRFSELTRPGFGSGSDCCAVCLHEFENDDEIRR 

LTNCQHIFHRSCLDRWMMGYNQMTCPLCRTPFISDELQVAFNQRVWSESELLAESN*

>G1127 (191.. 1351) 

GACAGACTCTCTCTGTATGTGTGCGAGAAGCGAGAAGCGAGAGAGAGAGAGAGAGAGTTG 
TTAGCTCACACGCTTTCTCTATTTTCTCGGAATTCACAAAACAGAAAGTTTCATCCTTTA 
CGAGAATTAAGCCGAAAGAAACAATCTTTGAGTTTGATTTCTTCTTCCTTCCTTCTCTCT 
CTCTGCTCTAATGGATTCCAGAGACATCCCACCGTCACATAACCAGCTTCAACCACCACC 
GGGAATGTTAATGTCTCATTACCGTAACCCTAACGCCGCCGCTTCACCATTAATGGTTCC 
CACTTCCACATCTCAACCGATTCAACACCCTCGTCTTCCTTTTGGCAATCAACAACAATC
TCAAACGTTTCATCAGCAGCAACAACAACAAATGGATCAGAAGACTCTTGAATCTCTTGG
ATTTGGTGATGGATCACCTTCTTCTCAACCGATGCGATTCGGGATCGATGATCAGAATCA
GCAACTGCAAGTGAAGAAGAAGCGAGGAAGGCCGAGAAAGTATACTCCTGATGGTAGCAT 
TGCTTTAGGTTTAGCTCCTACGTCTCCTCTTCTCTCTGC^GCTTCTAATTCTTACGGTGA 
GGGTGGTGTTGGAGATAGTGGTGGAAATGGAAACTCTGTTGATCCACCTGTTAAACGTAA 
CAGAGGAAGGCCTCCTGGTTCTAGTAAGAAACAGCTTGATGCTTTAGGAGGAACTTCAGG 
AGTTGGGTTTACACCTCATGTCATTGAAGTGAACACAGGAGAGGACATAGCGTCAAAGGT 
GATGGCTTTTTCGGATCAAGGGTCAAGAACAATTTGTATTCTCTCTGCAAGTGGTGCAGT 
TTCTAGAGTGATGCTTCGTCAAGCTTCTCATTCTAGTGGAATCGTTACTTATGAGGGACG 
ATTTGAGATCATTACTCTCTCAGGCTCAGTCTTGAATTATGAGGTAAATGGTTCCACCAA 
CAGAAGTGGTAACTTGAGTGTGGCTTTGGCTGGACCTGATGGCGGCATCGTAGGTGGCAG 
TGTAGTTGGTAATCTAGTAGCTGCAACAGAAGTCCAGGTGATAGTGGGAAGCTTTGTTGC 
AGAAGCAAAGAAACCGAAACAAAGTAGTGTTAACATTGCTCGGGGGCAGAATCCTGAACC 
GGCTTCAGCGCCGGCTAACATGTTGAACTTTGGATCAGTCTCTCAAGGACCATCGAGCGA 
GTCATCAGAAGAGAATGAGAGCGGTTCTCCTGCAATGCACCGTGACAATAATAATGGGAT 
ATATGGAGCTCAACAACAACAACAACAACAACCTCTTCATCCTCATCAGATGCAAATGTA 

TTGGTTACGGTTATGGTTTGATTTCTT 

>G1127 Amino Acid Sequence (domain in AA coordinates: 103-110, 155-

I^SRDIPPSHNQLQPPPGMLMSHYRNPNAAASPLMVPTSTSQPIQHPRLPFGNQQQSQTF 

HQQQQQQMDQKTLESLGFGDGSPSSQPMRFGIDI^QNQQLQVKKKRGRPRKYTPDGSIAIjG 

LAPTSPLLSAASNSYGEGGVGDSGGNGNSVDPPVKRNRGRPPGSSKKQLDALGGTSGVGF 

TPHVIEVin'GEDIASKVMAFSDQGSRTICILSASGAVSRVMLRQASHSSGIVTYEGRFEI 

ITLSGSVLNYEVNGSTNRSGNLSVALAGPDGGIVGGSVVGNLVAATQVQVIVGSFVAEAK 

KPKQSSVNIARGQNPEPASAPANMIiNFGSVSQGPSSESSEENESGSPAMHRDNNNGIYGA 

QQQQQQQPLHPHQMQMYQHLWSNHGQ* 
>G1131 (57..758)

TCGACTCCTCTCCTGATTGCTTCACCTTCTTCTTTACTACAGGTTTCAGCTCCTCAATGT 
CCATGGATTGCTTAAGCTACTTCTTTAACTACGATCCTCCTGTCCAGCTCCAGGATTGCT 
TTATTCCCGAGATGGATATGATTATCCCTGAAACCGATAGTTTCTTCTTCCAATCTCAAC 
CGCAACTGGAGTTTCATCAGCCATTGTTTCAAGAAGAAGCTCCTTCACAGACCCACTTTG 
ACCCTTTCTGCGACCAGTTTCTTTCTCCGCAAGAAATCTTTCTCCCTAACCCTAAAAACG 
AAATCTTCAACGAAACACACGACCTCGATTTCTTTCTCCCCACGCCAAAACGCCAGAGAC 
TTGTTAACTCCAGCTACAATTGTAACACTCAAAACCATTTCCAGAGCCGTAACCCGAATT 
TCTTCGACCCTTTCGGCGACACTGATTTCGTTCCAGAATCTTGTACCTTCCAGGAGTTTC
GAGTTCCGGATTTCTCTTTAGCTTTCAAGGTAGGCCGGGGAGATCAAGATGACTCAAAGA 
AACCGACGCTTTCATCTCAGAGCATCGCGGCTAGAGGGAGGAGAAGAAGAATTGCAiSAGA

AGACTCACGAGCTCGGAAAACTCATCCCCGGTGGCAATAAACTTAACACCGCCGAGATGT 

TCCAAGCCGCCGCTAAGTATGTCAAGTTTTTGCAGAGTCAAGTTGGGATTCTCCAACTC 

TGCAGACCACAAAGAAGGTAATAACCAACCCCAAATAAGAACTTT^ 

CTCTAATCGTGTTTTCTCACAAGCTTCraAATTTGTTTACGCAGGGTAGCTCTAATGTGC 

AAATGGAAACTCAGTATTTGCTTGAATCGCAAGCAATCCAGGAGAAGTTATCAACAGAGG 

AAGTGTGTTTGGTACCGTGTGAAATGGTTCAAGATCTAACAACTGAAGAAACCATTTC 

GAACCCCGAATATTTCTCGAGAAATCAACAAGTTACTGTCTAAACATCTGGCTAACTAGT 

TTTAGTTTCAAGCCTGAAGTTCTCTATGC^ 

TTCTTAGTTAGTGTTTTGTCTTGTTGATTTAGGGGCTAATTATCCTGGTTAATCTCCTCT 
TAACTGGGAA 

>G1131 Amino Acid Sequence (domain in AA coordinates: 173-220) 
MSMDCLSYFFNYDPPVQLQDCFIPEMDMIIPETDSFFFQSQPQLEFHQPLFQEEAPSQTH
FDPFCDQFLSPQEIFLPNPKNEIFNETHDLDFFLPTPKRQRLVNSSYNCNTQNHFQSRNP

NFFDPFGDTDFVPESCTFQEFRVPDFSLAFKVGRGDQDDSKKPTLSSQSIAARGRRRRIA 
EKTHELGKLIPGGNKLNTAEMFQAAAKYVKFLQSQVGILQLMQTTKKVITNPK*
>G1145 (243.. 1142) 

GTGATTTCTCTCTGCCATTTCCTTCGATTTGATTTCTGGGTTCTCTTCTTCTCGTCTCTC 

TTCTGCATGTTTCGCCACTCTACCTTAGAAAAAAGGTTACTTTCGCCTCCGATTTAGGCT 

CGATTTGATGAATTCGTCGTCGTGTGGCTATTTATCAAATTGAGC^TTAGGGTTTCTGAT 

TTGTGGGTTCAGAATTGTTTTTATCTATCTGTCTTGTTGTlT , ri u rGTCCGCTACAAAAGC 

CTATGGATTCTCAGAGGGGTATTGTTGAACAAGCTAAATCTCAGTCCTTGAATAGGCAAA 

GCTCTCTTTACAGCTTAACACTTGATGAGGTTCAAAATCACTTGGGGAGTTCTGGTAAAG 

CTCTGGGAAGCATGAACCTTGATGAGCTTTTGAAGAGTGTCTGTTCTGTTGAAGCTAATC 

AGCCATCGTCTATGGCTGTCAATGGTGGAGCAGCTGCTCAGGAGGGTCTTTCTCGCCAGG 

GGAGTTTGACTTTGCCTCGGGATCT(^GCAAAAAGACTGTTGATGAGGTTT^ 

TTCAGCAGAATAAGAATGGAGGTAGTGCTCATGAGAGGAGGGATAAGCAGCCTACACTTG 

GGGAAATCACGCTTGAAGACCTGTTGTTGAAAGCAGGAGTGGTCACTGAGACTATCCCTG 

GTTCGAACCATGATGGTCCTGTTGGTGGTGGTAGTGCTGGTTCAGGTGCTGGTTTAGGGC 

AAAACATTACTCAAGTTGGCCCATGGATTCAATA 

CTCAAGCATTTATGCCCTATCCGGTTTCAGATATGCAAGCAATGGTGTCTCAGTCTTCTT 

TGATGGGTGGTTTGTCAGATACACAAACTCCTGGAAGGAAGAGGGTAGCTTCAGGAGAAG 

TTGTAGAGAAGACTGTAGAGAGGAGGCAGAAGAGAATGATAAAGAACAGAGAGTCTGCTG 

CTCGTTCCCGAGCTAGGAAACAGGCTTACACTCATGAGCTAGAGATCAAAGTTTCACGGT 

TAGAAGAAGAAAACGAAAGACTCAGGAAGCAAAAGGAGGTGGAAAAATCCTCCCAAGTGT 

ACCACCGCCTGATCCCAAGCGGCAGCTCCGACGGACAAGCTCGGCTCCTTTCTGATCTCT 

AAACTCTTTTTGTCTTTTTCTTTTTTTCTCTTCTG 

GGAAAACAGCTTTGTTTCTTTGTACATTCCGTAGACT 

TAACTTTAAAATATTCTCGAGTTATTGTAGTAGCAGACTAGCAGCAGTAATGGTTTTCAT 
GAGTCCGATTGAAATTCAGAGATTGAACAGGAAAAAA 

>G1145 Amino Acid Sequence (conserved domain in AA coordinates: 227-270)
MDSQRGIVEQAKSQSLNRQSSLYSLTIJ^EVQNH^ 

PSSMAWGGAT^QEGLSRQGSLTLPRDLSKJCTVDEWKDIQQNK^GGSAHERRDKQPTLG 
EMTLEDLLLKAGWTETIPGSNHDGPVGGGSAGSGAGLGQNITQVGPWIQYHQLPSMPQP 
QAFMPYPVSDMQAMVSQSSLMGGLSDTQTPGRKRVASGEVVEKTVERRQKRMIKNRESAA
RSRARKQAYTHELEIKVSRLEEENERLRKQKEVEKSSQVYHRLIPSGSSDGQARLLSDL* 
>G1229 (123.. 1217) 

TTTGGGCGGGTCTTTCTTTCCCTAAATCTTTCTTTTATTTTGCTGTTTAAAAAAAAAATC 
CAACCATAAGACAAAACAACGAACGAGGAAGAGAGAGAGAGAAGGATATATCTCTAATCA 
CGATGCAGGAGATAATACCGGATTTTCTTGAAGAGTGTGAATTTGTCGACACTTCACTAG 
CCGGAGATGATCTATTTGCCATCTTAGAGAGTCTTGAAGGTGCCGGAGAGATATCTCCGA 
CAGCTGC^TCTA(^CCTAAAGATGGAACCACAAGTTCCAAGGAGTTAGTTAAGGATCAAG 
ATTATGAAT^ACTCATCTCCTAAGAGGAAAAAGCAAAGACTAGAAACCAGGAAAGAAGAGG 
ACGAAGAAGAAGAAGACGGAGACGGAGAAGCAGAAGAAGATAATAAGCAAGATGGGCAAC 
AAAAGATGTCTCATGTAACCGTGGAACGTAACCGGAGAAAGCAAATGAACGAGCACTTAA 
CCGTTTTGCGTTCTCTTATGCCTTGTTTCTACGTCAAACGGGGGGACCAAGCATCGATCA 
TAGGAGGAGTTGTGGAGTACATAAGCGAGTTACAACAAGTTCTCCAATCTTTGGAAGCCA 
AGAAACAACGTAAAACCTACGCCGAAGTCCTAAGCCCGAGAGTTGTCCCGAGCCCTCGTC

CTTCACCGCCTGTTCTAAGCCCAAGAAAACCGCCT 

AGATTCACCACCACCTACTTCTCCCTCCCATAAGTCCTC 

CATACCGGGCCATTCCACCGCAACTACCACTCATCCCACAGCCTCCGCTTCGCTCTTACA 

GCTCATTGGCCAGTTGCAGCAGCTTAGGAGATCCACCTCCATACTCTCCTGCTTC^TCTT 

CTTCATCTCCTTCAGTTAGTAGTAACCATGAGAGTAGTGTGATCAATGAGCTTGTTGCTA 

ACTCAAAATCGGCTTTGGCTGATGTGGAAGTGAAGTTTTCAGGAGCTAACGTGCTGCTCA 

AAACGGTGTCGCATAAGATCCCGGGACAAGTTATGAAGATAATTGCTGCTCTTGAAGATT 

TGGCTCTTGAGATTCTTCAGGTTAATATTAACACCGTCGACGAAACCATGCTTAATTCTT 

TCACCATCAAGATTGG7VATTGAGTGCCAACTAAGTGCAGAAGAACTGGCTCAACAAATTC 

AGCAAACATTCTGCTAGTAAAGAAGGATTTAATATAGCTTCGTATAAACCTTAACGAGAG 

AGCAGTACGTACTCACTTTCTCTCCTTAGTATCCCTTTAATTATCTTTTCAGTTTTCTGC 

AAAGATATGGAGTTTAAAAAAATAAAATTGTTATCTAAAGTTTTAATCAAATATTGATTA 

ATTATAACTAATATAGGTATAAGTGAGTTTTAl^AGATTATCAGCTTCATAACAGCCATCG 

TCATGTTTACTTTCTTTTAAATTTTAGAATTTAGACGTACTCCTACCATGTAATT 

TCTGTCATTACATCAAGCATTGTAGCTGTAATTGCATATC 

TGATCTCATGAATAATATTCTTCTTGCAACACAAAAAAAAAAAA 

>G1229 Amino Acid Sequence (domain in AA coordinates: 102-160) 

MQEIIPDFLEECEFVDTSLAGDDLFAILESLEGAGEISPTAASTPKDGTTSSKELVKDQD

YENSSPKRKKQRLETRKEEDEEEEDGDGEAEEDNKQDGQQKMSHVTVERNRRKQMNEHLT

VLRSLMPCFYVKRGDQASIIGGVVEYISELQQVLQSLEAKKQRKTYAEVLSPRVVPSPRP 

SPPVLSPRKPPLSPRINHHQIHHHLLLPPISPRTPQPTSPYRAIPPQLPLIPQPPLRSYS 

SLASCSSLGDPPPYSPASSSSSPSVSSimESSVIl^LVANSKSALADVEVKFSGAN^ 

WSHKIPGQVMKIIAALEDLALEILQVNINTVDETMLNSFTIKIGIECQLSAEELAQQIQ 

QTFC* 

>G1246 (1..1746) 

ATGATCATGTACGGAGGAGGAGGAGCAGGGAAGGACGGTGGATCC^CCAATCACTTATCA 
GACGGAGGAGTGATATTGAAGAAAGGTCCATGGACGGCGGCGGAAGATGAGATACTTGCT 
GCGTACGTTAGAGAGAACGGTGAAGGGAATTGGAACGCCGTTCAGAAAAACACAGGTTTG 
GCTCGTTGCGGCAAAAGCTGCCGTCTTCGATGGGCCAATCACCTCCGACCAAATCTGAAA 
AAAGGCTCTTTCACCGGTGACGAAGAACGTCTCATCATTCAGCTTCATGCTCAGCTTGGT 
AACAAATGGGCTCGCATGGCTGCTCAGTTACCGGGAAGAACAGACAACGAGATTAAGAAC 
TATTGGAACACGAGATTGAAACGACTTCTTCGCCAAGGACTTCCTCTTTATCCTCCAGAT 
ATTATCCCTAACCATCAACTCCATCCACATCCACATCATCAACAACAACAGCAACATAAC 
CATCATC^TCATCATCATCAACAACAACAACAACAT 

TCTTCACAACGAAACACACCATCATCTTCCCCTCTTCC^TCTCCAACACCAGCAAACGCA 
AAGTCCTCATCATCCTTCACTTTTCATACCACGACTGCTAACCTCCTCCATCCACTTAGC 
CCTCACACTCOtf^C&CACC^T^^ 

TCTCCTTTATGTTCCCCTCGCAACAACCAATACCCGACCCTTCCCCTCTTTGCCCTCCCG 

CGTTCCCAAATCAACAACAACAACAACGGAAATTTCACTTTCCCTAGACCTCCACCTC 

CTTCAACCGCCTTCATCACTCTTCGCAAAACGTTACAACAATGCTAACACTCCTCTTAAT 

TGCATCAACCGCGTCTCAACCGCACCATTTTCCCCTGTTTCAAGAGACTCCTACACTTCC 

TTTCTTACATTGCCTTACCCTTCCCCAACCGCTCAAACCGCTACTTACCACAATACTAAT

AACCCTTACTCTTCCTCTCCTTCCTTCTCTTTAAACCCTTCTTCTTCTTCTTACCCTACA 

Ta^CTTCTTCCCCAAGCTTTCTTC^CTCCCATTAC^CTCCTTCTTCCACCTCATTTCAT 

ACC^CCCAGTTTACTCCATGAAACAAGAGCAGCTCCCTTC^^CC^ATTCCCCAAATA 

GATGGCTTCAATAACGTCAACAACTTCACAGACAACGAGAGACAGAATCATAACCTTAAC 

AGTTCCGGTGCTCATAGAAGAAGTAGTAGCTGCAGCCTCTTAGAGGATGTCTTCGAAGAG 

GCCGAAGCTTTAGCCTCTGGAGGCAGAGGCCGACCTCCAAAACGAAGACAACTCACAGCT 

TCTCTTCCGAACCACAACAACAACACCAACAACAACGACAACTTCTTCTCGGTTAGTTTC 

GGACATTATGATTCTTCTGACAACTTATGTTCCTTGCAAGATTTGAAATCAAAGGAAGAA 

GAGTCTCTTCAAATGAACACAATGCAGGAGGACATAGCTAAGCTTCTTGATTGGGGAAGT 

GATAGTGGAGAGATCTCTAATGGACAATCATCTGTTGTCACTGACGACAATCTTGTTCTT 

GATGTTCATCAATTAGCTTCACTATTCCCGGCTGATTCTACAGCCGTCGTAGCCGCAACA 

AACGACCAACACAACAAGAATAATAACAATAATTGTTCCTGGGATGACATGCAGGGAATA 

AGGTAG 

>G1246 Amino Acid Sequence (domain in AA coordinates: 27-139) 
MIMYGGGGAGKDGGSTNHLSDGGVILKKGPWTAAEDEILAAYVRENGEGNWNAVQKNTGL

ARCGKSCRLRWANHLRPNLKKGSFTGDEERLIIQLHAQLGNKWARMAAQLPGRTDNEIKN

YWNTRLKRLLRQGLPLYPPDIIPNHQLHPHPHHQQQQQHNHHHHHHQQQQQHQQMYFQPQ

SSQRNTPSSSPLPSPTPANAKSSSSFTFHTTTANLLHPLSPHTPNTPSQLSSTPPPPPLS 

SPLCSPRMaQYPTLPLFALPRSQINl^NNGNFTFPRPPPLLQPPSSLFAKRYNNANTPIiN 

CINRVSTAPFSPVSRDSYTSFLTLPYPSPTAQTATYHNTmPYSSSPSFSLNPSSSSYPT 

STSSPSFLHSHYTPSSTSFHTNPVYSMKQEQLPSNQIPQIDGFNNVNNFTDNERQNHNLN 

SSGAHRRSSSCSLLEDVFEEAEALASGGRGRPPKRRQLTA^ 

GHYDSSDNLCSLQDLKSKEEESLQMNTMQEDIAKLLDWGSDSGEISNGQSSVVTDDNLVL 
DVHQLASLFPADSTAVVAATNDQHNKNNNNNCSWDDMQGIR*
>G1255 (138.. 1388) 

CAGCTCAAACTCTCTAGGACTACACT7\AATCTAACTTTTTGCAGAGAGCAAAAGATTCAA 
TAATTGAGATTGATCTCAAAACCAAAGCTCTCGTGCTCTTGTCGTTGATGTTGGTTGTGT 
AGACTTTGTATACAATGATGAAAAGTTTGGCGAATGCTGTTGGAGCGAAGACGGCGAGGG 
CTTGCGACAGCTGCGTGAAGAGAGGTGCACGGTGGTACTGCGCGGCCGACGATGCTTTTC 
TTTGCCAGTCTTGCGACAGTTTGGTCCATTCAGCAAACCCTCTTGCTCGCCGCCACGAGA 
GAGTCCGTTTGAAGACGGCTAGCCCGGCGGTCGTAAAGCATAGCAACCACTCATCAGCTT 
CTCCTCCAC^TGAGGTCGCCACGTGGCATC^CGGGTTTACTCGTAAAGCTCGAACGCCAC 
GTGGCTCTGGTAAGAAAAACAATTCGTCGATATTTCATGACTTGGTTCCTGATATTAGTA 
TTGAGGATCAGACAGACAACTATGAGCTTGAAGAGCAGCTGATCTGTCAAGTGCCGGTTC 
TAGATCCGTTGGTGTCTGAGCAGTTCTTGAACGATGTCGTTGAGCCCAAGATCGAGTTTC 
CTATGATCAGAAGTGGTTTGATGATCGAGGAGGAGGAAGACAACGCTGAAAGTTGTCTTA 
ATGGATTTTTCCCGACCGACATGGAGCTTGAGGAGTTTGCTGCTGACGTGGAGACTCTGC 
TCGGTCGCGGGTTAGACACGGAGTCGTATGCCATGGAGGAGCTAGGGTTATCTAATTCAG 
AGATGTTCAAAATCGAAAAAGATGAGATTGAAGAAGAAGTAGAAGAGATAAAAGCCATGA 
GCATGGATATATTTGATGATGATCGAAAAGACGTGGATGGAACAGTACCGTTTGAGCTAA 
GCTTTGATTACGAGTCGTCACACAAGACGTCCGAAGAAGAGGTAATGAAGAACGTTGAAA 
GTAGTGGTGAATGTGTTGTTAAGGTGAAAGAGGAAGAACATAAGAATGTTCTGATGCTAA 
GATTAAACTATGACTCGGTGATATCCACTTGGGGAGGTCAAGGTCCACCGTGGAGTTCAG 
GAGAGCCACCGGAACGAGACATGGACATCAGCGGTTGGCCAGCCTTTTCCATGGTGGAGA 
ATGGAGGAGAAAGTACTCATCAGAAGCAATACGTTGGTGGATGTTTACCATCAAGTGGGT 
TTGGAGATGGAGGTAGAGAAGCTAGAGTTTCGAGATACAGAGAGAAGAGGAGGACAAGGT 
TGTTTTCTAAGAAGATACGGTACGAGGTACGTAAATTGAATGCAGAGAAAAGACCACGAA 
TGAAAGGAAGATTCGTGAAGAGAGCCTCGCTCGCTGCTGCTGCTTCACCATTAGGTGTTA 
ATTACTGAATAGTTAATATCTATTCATGTTATATCTCACTTTACAAATTTCGGTGAATCT 
TTTTTCTTCTGAAACAACAGAAGTTATTTTGGCACTTAATTGTGCTTTGAGGACTTGTAT 
GTACATAGAAGTAACCAATAATAATGTGACTTTTACTA 

>G1255 Amino Acid Sequence (domain in aa coordinates: 18-56) 
MKSLANAVGAKTARACDSCVKRRARWYCAADDAFLCQSCDSLVHSANPLARRHERVRLKT
ASPAVVKHSNHSSASPPHEVATWHHGFTRKARTPRGSGKKNNSSIFHDLVPDISIEDQTD
NYELEEQLICQVPVLDPLVSEQFLNDVVEPKIEFPMIRSGLMIEEEEDNAESCLNGFFPT 
DMELEEFAADVETLLGRGLDTESYAMEELGLSNSEMFKIEKDEIEEEVEEIKAMSMDIFD 
DDRKDVDGTVPFELSFDYESSHKTSEEEVMKNVESSGECVVKVKEEEHKNVLMLRLNYDS

VISTWGGQGPPWSSGEPPERDMDISGWPAFSMVENGGESTHQKQYVGGCLPSSGFGDGGR
EARVSRYREKRRTRLFSKKIRYEVRKLNAEKRPRMKGRFVKRASLAAAASPLGVNY*
>G1304 (1..978) 

ATGGGGCGATCACCATGTTGCGATGAGAATGGTCTAAAGAAAGGGCCATGGACACAAGAG 
GAGGATGATAAACTOATAGATCA(^TTCAAAAACATGGCCATGGCAGCTGGAGAGCTCTT 
CCAAAGCAAGCCGGTTTAAACCGATGCGGAAAGAGTTGTAGATTAAGATGGACCAACTAC 
TTGAGACCTGACATCAAGAGAGGAAATTTCACTGAAGAGGAAGAACAAACTATTATCAAC 
CTCCATTCCCTTCTTGGTAACAAGTGGTCGTCGATAGCCGGTAATCTTCCTGGAAGAACG
GACAATGAAATAAAAAACTATTGGAACACACATTTGAGAAAGAAACTTCTCCAAATGGGG 
ATTGATCCGGTGACCCATAGGCCAAGAACCGACCATCTAAACGTTTTAGCAGCTCTCCCG 
CAGCTTATAGCCGCCGCAAATTTCAACAGCCTCT^ 

GATGCAACAACTCTTGCTAAAGCTCAACTGCTACACACTATGATTCAAGTCCTTAGCACC 
AATAACAACACCACCAATCCTTCTTTTTCTTCATCAACTATGCAAAACAGTAACACCAAT 
CTCTTTGGCCAAGCTTCTTACTTAGAGAACCAAAATCTTTTTGGTCAGTCTCAAAACTTC 
TCTCACATTCTTGAGGATGAGAATTTGATGGT
GACTCTTTTTCTTCCCCCATACAACCCGGTTT^ 

TTGGTTCCGGCGTCTCCTGAAGAATCTAAAGAAACTCAAAGGATGATCAAGAACAAAGAC 
ATCGTCGATTACCATCATCATGATGCTTCAAACCCTTCATC7VTCAAACTCAACGTTTACA 
CAAGATCATCATCACCCATGGTGTGACACTATTGATGATGGAGCAAGTGATTCTTTTTGG 

AAAGAGATAATAGAGTAA 

>G1304 Amino Acid Sequence (conserved domain in AA coordinates: 13-118)
MGRSPCCDENGLKKGPWTQEEDDKLIDHIQKHGHGSWRALPKQAGLNRCGKSCRLRWTNY
LRPDIKRGNFTEEEEQTIINLHSLLGNKWSSIAGNLPGRTDNEIKNYWNTHLRKKLLQMG
IDPVTHRPRTDHLNVLAALPQLIAAANFNSLLNLNQNVQLDATTLAKAQLLHTMIQVLST
NNNTTNPSFSSSTMQNSNT^FGQASYLENQNLFGQSQOT 

DSFSSPIQPGFQDDHNSLPLLVPASPEESKETQRMIKNKDIVDYHHHDASNPSSSNSTFT 
QDHHHPWCDTIDDGASDSFWKEIIE*
>G1318 (7.. 849) 

AAAAATATGAGGAAGCCAGAGGTAGCCATTGCAGCTAGTACTCACCAAGTAAAGAAGATG 
AAGAAGGGACTTTGGTCTCCTGAGGAAGACTCAAAGCTGATGCAATACATGTTAAGCAAT 
GGACAAGGATGTTGGAGTGATGTTGCGAAAAACGCAGGACTTCAAAGATGTGGCAAAAGC 
TGCCGTCTTCGTTGGATCAACTATCTTCGTCCTGACCTCAAGCGTGGCGCTTTCTCTCCT 
CAAGAAGAGGATCTGATC^TTCGCTTTCATTCCATCCT 

GCAGCACGATTGCCTGGTCGGACCGATAACGAGATCAAGAATTTCTGGAACTCAACAATA 
AAGAAAAGGCTAAAGAAGATGTCCGATACCTCCAACTTAATCAACAACTCATCCTCATCA 
CCCAAC^CAGCAAGCGATTCCTCTTCTAATTCCGC^TCTTCTTTGGATATTAAAGACATT 
ATAGGAAGCTTCATGTCCTTACAAGAACAAGGCTTCGTCAACCCTTCCTTGACCCACATA 
CAAACCAACAATCCATTTCCAACGGGAAACATGATCAGCCACCCGTGCAATGACGATTTT 
ACCCCTTATGTAGATGGTATCTATGGAGTAAACGCAGGGGTACAAGGGGAACTCTACTTC 
CCACCTTTGGAATGTGAAGAAGGTGATTGGTACAATGCAAATATAAACAACCACTTAGAC 
GAGTTGAACACTAATGGATCCGGAAACGCACCTGAGGGTATGAGACCAGTGGAAGAATTT 
TGGGACCTTGACCAGTTGATGAACACTGAGGTTCCTTCGTTTTACTTCAACTTCAAACAA 
AGCATATGAATATTTTTACGTCATCTTATTCTTTTTTCTATTGCGGTTTATACTCAAGAT 
TCTTAGCCACACACACATAAATGCAAATATATATACATTGTTAGAGAGTATTTTGTATTT 
CGTATAATCTTTTCGTACTAGGGCTTGAGCCTTGAGGTCCCATGTAACGATTAGTCAATG 
TAAAACATATATCCTATAATAAATAAATAAAAGAAATAATAAGCACATAAAAAAAAAAAA 

A 

>G1318 Amino Acid Sequence (domain in AA coordinates: 20-123) 

MRKPEVAIAASTHQVKKMKKGLWSPEEDSKLMQYMLSNGQGCWSDVAKNAGLQRCGKSCR

LRWINYLRPDLKRGAFSPQEEDLIIRFHSILGNRWSQIAARLPGRTDNEIKNFWNSTIKK

RLKKMSDTSNLINNSSSSPNTASDSSSNSASSLDIKDIIGSFMSLQEQGFVNPSLTHIQT 

NNPFPTGNMISHPCNDDFTPYVDGIYGVNAGVQGELYFPPLECEEGDWYNANINNHLDEL

NTNGSGNAPEGMRPVEEFWDLDQLMNTEVPSFYFNFKQSI*

>G1320 (39.. 788) 

GAAGATCATAAAGATCAAAAGGAGAGAGGTATTAAAAAATGATGTGTAGTCGAGGCCATT 
GGAGACCTGCAGAAGACGAGAAGCTAAGAGAACTCGTCGAGCAATTTGGTCCTCATAATT 
GGAACGCCATAGCTCAGAAGCTCTCTGGTCGATCTGGTAAGAGTTGTAGATTGAGATGGT 
TTAATCAATTGGATCCTAGGATTAACCGAAACCCTTTCACGGAGGAAGAAGAAGAAAGGC 
TTTTAGCGCCTCATCGGATCCATGGGAACAGATGGTCTGTGATCGCTAGATTTTTTCCCG 
GTCGAACTGATAACGCTGTTAAAAACCATTGGCACGTCATCATGGCTCGTCGTGGCCGAG 
AACGGTCCAAGCTCCGTCCACGAGGCCTTGGCCATGATGGCACGGTGGCTGCGACTGGGA 
TGATTGGTAATTATAAAGACTGCGATAAGGAGAGAAGATTGGCAACCACAACCGCTATCA 
ATTTTCCTTATCAATTCTCTCATATTAATCATTTTCAAGTCCTCAAAGAGTCCTTGACCG 
GAAAGATCGGGTTCAGAAATAGTACTACTCCAATACAAGAAGGAGCAATAGACCAAACTA 
AACGACCGATGGAGTTCTACAATTTTCTCCAAGTAAACACGGATTCGAAGATACACGAAT 
TGATAGATAATTCAAGAAAAGACGAAGAAGAAGATGTCGATCAAAACAACCGAATTCGTA 
ACGAGAATTGTGTTCCATTTTTCGACTTTTTGTCTGTTGGAAACTCTGCCTCTCAGGGTT 
TATGTTAATTTGTCCGTACCACATGTACTATAAGGTGGACCATATGTTAACTAAAGATAA 
TGTAGAAAGTACTAATCAATTAGAGCTCCTGTTTGAGCCA AATG TGAAAATTAGTTAAGA 
(^TCCCAAACATTTTCTTGTATAACACATATAA 

CTATTTTTATTTTAAGGATGTTTAATCAGACCCATAACCATTCGATAAAAAAAAAAAAAA 
>G1320 Amino Acid Sequence (domain in AA coordinates: 5-108)
MMCSRGHWRPAEDEKLRELVEQFGPHNWNAI 

TEEEEERLLAPHRIHGlimWSVIARFFPGRTDNAVKNHWHVIMARRGRER^ 

GTVAATGMIGNYKDCDKERRLATTTAINFPYQFSHINHFQVLKESLTGKIGFRNSTTPIQ 

EGAIDQTKRPMEFYNFLQVNTDSKIHELIDNSRKDEEEDWQNMIRNENCVPFFDFLSV 

GNSASQGLC* 
>G1330 (36..959)

GTACCGGCGACCTCTTTGTGGGTCACTCTTCATCAATGGGTGACAAAGGAAGGAGCTT^ 

AGATCAACAAGAACATGGAGGAATTCACGAAAGTGGAAGAAGAAATGGACGTAAGGAGAG 

GTCCATGGACAGTTGAGGAAGATTTAGAGCTCATCAATTACATTGCTAGTCATGGTGAAG 

GTCGATGGAACTCTCTCGCTCGTTGCGCCGAACTCAAAAGGACCGGAAAAAGCTGCAGAC 

TTCGGTGGCTGAACTATCTCCGACCAGATGTGCGCCGTGGAAACATAACCCTCGAAGAAC 

AACTCTTGATTCTTGAACTTCACACACGTTGGGGCAATAGATGGTCTAAGATTGCACAAT 

ATTTACCAGGAAGAACGGATAACGAGATCAAAAACTATTGGAGAACACGTGTTCAAAAGC 

ATGCAAAACAGCTTAAATGCGACGTGAACAGTCAACAATTTAAAGACACCATGAAGTATC 

TTTGGATGCCTCGGCTCGTAGAAAGGATCCAAGCCGCGTCCATCGGGTCTGTTTCCATGT 

CATCTTGCGTCACCACCTCCTCAGATCAGTTCGTGATCAACAACAACAA(^ 

TGGATAATTTGGCTTTAATGAGTAACCCTAATGGTTACATCACGCCGGATAATTCCAGCG 

TGGCAGTATCTCCTGTATCAGATTTGACGGAGTGTCAAGTGAGTAGTGAAGTGTGGAAGA 

TTGGTCAGGATGAGAATTTGGTGGATCCAAAAATGACATCGCCGAATTATATGGATAATA 

GCAGTGGACTATTAAACGGAGATTTTACGAAGATGCAAGATCAAAGTGACCTTAATTGGT 

TTGAAAATATTAATGGGATGGTACCAAATTATTCGGACAGTTTTTGGAACATTGGAAATG 

ATGAAGACTTCTGGCTCTTACAACAACATCAACAAGTCCACGACAATGGAAGCTTCTGAA 

TAGACAAGAAGCTATGCGGCC 

>G1330 Amino Acid Sequence (domain in AA coordinates: 28-134) 
MGDKGRSLKINKNMEEFTKVEEEMDVRRGPWTVE 

KRTGKSCRLRWLNYLRPDVRRGNITLEEQLLILELHTRWGNRWSKIAQYLPGRTDNEIKN 

YWRTRVQKHAKQLKCDWSQQFKDTMKYLWMPRLVERIQAASIGSVSMSSCVTTSSDQF 

INNNNTNITTONLALMSNPNGYITPDNSSVA 

TSPNY^NSSGLLNGDFTKMQDQSDLNWFENINGMVPNYSDSFWNIGNDEDFWLLQQHQQ 

VHDNGSF* 

>G1352 (79.. 900) 

GCGCGATTAAAAACTCTCAACTTTTCTCTCAAATTTCTGATCCTTTGATCCAACAGTTAG 
AAGAAGATTCATCTGATCATGGCCCTCGAAGCGATGAACACTCCAACTTCTTCTTTCACC 
AGAATCGAAACGAAAGAAGATTTGATGAACGACGCCGTTTTCATTGAGCCGTGGCTTAAA 
CGCAAACGCTCCAAACGTCAGCGTTCTCACAGCCCTTCTTCGTCTTCTTCCTCACCGCCT 
CGATCTCGACCCAAATCCCAGAATCAAGATCTTACGGAAGAAGAGTATCTCGCTCTTTGT 
CTCCTCATGCTCGCTAAAGATCAACCGTCGCAAACGCGATTTCATCAACAGTCGCAATCG 
TTAACGCCGCCGCCAGAATCAAAGAACCTTCCGTACAAGTGTAACGTCTGTGAAAAAGCG 
TTTCCTTCCTATCAGGCTTTAGGCGGTCACAAAGC^ 

GTAATCTCAACAACCGCCGATGATTCAACAGCTCCGACCATCTCCATCGTCGCCGGAGAA 
AAACATCCGATTGCTGCCTCCGGAAAGATCCACGAGTGTTCAATCTGTCATAAAGTGTTT 
CCGACGGGTCAAGCTTTAGGCGGTCACAAACGTTGTCACTACGAAGGCAACCTCGGCGGC 
GGAGGAGGAGGAGGAAGCAAATCAATCAGTCACAGTGGAAGCGTGTCGAGCACGGTATCG 
GAAGAAAGGAGCCACCGTGGATTCATCGATCTAAACCTACCGGCGTTACCTGAACTCAGC 
CTTCATCACAATCCAATCGTCGACGAAGAGATCTTGAGTCCGTTGACCGGTAAAAAACCG 
CTTTTGTTGACCGATCACGACCAAGTCATCAAGAAAGAAGATTTATCTTTAAAAATCTAA 
TACTCGACTATTAAOTCTTGTGTGATTTTTTTCGTTACAACCATAGTTTCATTTTCATTT 
TTTTAGTTACAAATTTTTAATTGTTCTGATTTGGATTGAAA 

>G1352 Amino Acid Sequence (domain in AA coordinates: 108-129,167-188) 

MALEAMNTPTSSFTRIETKEDLMNDAVFIEPWLKRKRSKRQRSHSPSSSSSSPPRSRPKS

QNQDLTEEEYLALCLLMLAKDQPSQTRFHQQSQSLTPPPESKNLPYKCNVCEKAFPSYQA

LGGHKASHRIKPPTVISTTADDSTAPTISIVAGEKHPIAASGKIHECSICHKVFPTGQAL 

GGHKRCHYEGNLGGGGGGGSKSISHSGSVSSTVSEERSHRGFIDLNLPALPELSLHHNPI 

VDEEILSPLTGKKPLLLTDHDQVIKKEDLSLKI*

>G1354 (1..1047) 

ATGGAAAGTCTCGCACACATTCCTCCCGGTTATCGATTCCATCCGACCGATGAAGAACTC 
GTTGACTATTATCTCAAGAACAAAGTTGCATTCCCGGGAATGCAAGTTGATGTTATCA^
GATGTTGATCTCTACAAAATCGAGCCATGGGACATCCAAGAGTTATGTGGAAGAGGGACA 
GGAGAAGAGAGGGAATGGTATTTCTTTAGCCACAAGGACAAGAAATATCCAACTGGGACA 
CGAACCAATAGAGCAACGGGCTCCGGATTTTGGAAAGCAACGGGTCGAGACAAGGCCATT 
TACTCAAAGCAAGAGCTTGTTGGGATGAGGAAGACTC1TGTCTTTTACAAAGGTAGGGCC 
CCAAATGGTCAGAAATCTGATTGGATAATGCACGAATACCGTCTTGAGACCGATGAAAAT 
GGACCGCCTCATGAGGAAGGATGGGTGGTTTGTCGCGCTTTC 

ATGAACTACAACAATCCAAGAACAATGATGGGATCATCATCAGGCCAAGAATCTAACTGG 
TTCACGCAGCAAATGGATGTGGGGAATGGTAATTACTATCATCTTCCTGATCTAGAGAGT 
CCGAGAATGTTTCAAGGCTCATCATCATCATCACTATCATCATTACATCAGAATGATCAA 
GACCCTTATGGTGTCGTACTCAGC^CTATTAACGCAACCCCAACTACAATAATGCAACGA 
GATGATGGTCATGTGATTACCAATGATGATGATCATATGATCATGATGAACACAAGTACT 
GGTGATCATC^TCAATCAGGATTACTAGTCAATGATGATCATAATGATCAAGTAATGGAT 
TGGCAAACGCTTGACAAGTTTGTTGCTTCTCAGCTAATCATGAGCCAAGAAGAGGAAGAA 
GTTAACAAAGATCCATCAGATAATTCTTCGAATGAAACATTTCATCATCTCTCTGAAGAG 
CAAGCTGCAACAATGGTTTCGATGAATGCTTCTTCCTCTTCTTCTCCATGTTCCTTCTAC 

TCTTGGGCTCAAAATACACACACGTAA 

>G1354 Amino Acid Sequence (domain in AA coordinates: TBD) 
MESLAHIPPGYRFHPTDEELVDYYLK^AFPGMQVDVIKDVDLYKIEPWDIQELCGRGT 

GEEREVTCFFSHKDKKyPTGTRTl^TGSG 

PNGQKSDWIMHEYRLETDENGPPHEEGWWCRAFKKKLTTMNYl^ 
FTQQ^VGNGNYYHLPDLESPRMFQGSSSSSLSSLHQlTIX5DPYGVVIiSTINATPTTIMQR 
DIX3HVITNDDDHMIMMNTSTGDHHQSGLLVNDDHNDQVMDWQT 
VNKDPSDNSSNETFHHLSEEQAATMVSMNASSSSSPCSFYSWAQNTHT*
>G1360 (1..1257) 

ATGGGAGATAGAAACAACGACGGTGATCAGAAAATGGAGGATGTATTGTTGCCCGGATTT 

AGGTTTCATCCAACCGACGAAGAGCTCGTAAGCTTCTACCIX3AAGCGGAAGGTTCAAC 

AACCCTCTCTCCATTGAGCTCATAAGACAACTCGATATCTACAAATATGACCCCTGGGAT 

CTTCCAAAGTTTGCGATGACGGGTGAAAAAGAATGGTACTTTTATTGTCO^GGGACAGG 

AAGTATAGGAACAGCTCGAGGCCAAACCGAGTGACCGGAGCTGGTTTTTGGAAAGCCACG 

GGAACGGACCGGCCGATATACTCGTCAGAAGGAAACAAATGCATAGGTTTAAAGAAGTCC 

TTAGTGTTCTACAAAGGAAGAGCAGCGAAAGGAGTTAAGACTGATTGGATGATGCATGAG 

TTTCGTTTGCCTTCTCTCTCCGAACCATCTCCTCCTTCTAAGAGATTCTTCGACTCTCCT 

GTCTCTCCC^CGATTCATGGGCTATATGCAGAATCTTCAAAAAGACCAACACAACGACC 

ATGTCTAACCAAAAGCAATCAAACACATACCATTTTTCTTCAGACAAGATCCTCAAACCT 

AGCTCTCACTTCCAGTTTCACCATGAGAATATGAACACTCCCAAAACTAGTAATAGTACA 

ACTCCATCCGTTCCC^CTATAAGTCCCTTCTCTTACTTGGATTTCACTTCATACGACA 

CCCACCAACGTTTTCAATCCGGTTTCATGTTTAGACCAACAATACCTCACAAATCTCTTT 

CTTGCCACACAAGAAACACAACCTGAGTTTC^ 

TCGTTTCTGCTAAACACGTCTTCAGATTCGACCTTCTTGGGAGAATTCACGAGCCATATC 

GACCTCAGCGCAGTGTTGGCCCAAGAGCAATGTCCCCCGCTTGTAAGCCTACCACAGGAG 

TATCAAGAGACGGGATTCGAAGGAAATGGTATAATGAAGAACATGCGTGGTTCCAATGAA 

GATCATCTTGGTGATCATTGCGACACACTTCGGTTTGATGATTTCACTTCAACAATTAAT 

GAGAACCATCGTCATCATCAAGACCTGAAACAGAACATGACATTGCTGGAGAGTTATTAT 

TCTTCTTTATCGTCCATCAATAGCGATTTGCCAGCTTGTTTCTCCAGTACAACCTGA 

>G1360 Amino Acid Sequence (conserved domain in AA coordinates: 

MGDRNNDGDQKMEDVLLPGF

LPKFAMTGEKEWYFYCPRDRKYRNSSRPNRVTGAGFWKATGTDRPIYSSEGNKCIGLKKS

LVFYKGRAAKGVKTDWMMHEFRLPSLSEPSPPSKRFFDSPVSPNDSWAICRIFKKTNTTT

LRALSHSFVSSLPPETSTDTMSNQKQSNTYHFSSDKILKPSSHFQFHHENMNTPKTSNST

TPSVPTISPFSYLDFTSYDKPTNVFNPVSCLDQQYLTNLFLATQETQPQFPRLPSSNEIP

SFLLNTSSDSTFLGEFTSHIDLSAVLAQEQCPPLVSLPQEYQETGFEGNGIMKNMRGSNE

DHLGDHCDTLRFDDFTSTINENHRHHQDLKQNMTLLESYYSSLSSINSDLPACFSSTT* 

>G1364 (1..537) 

ATGGCGGAGTCGCAGGCCAAGAGTCCCGGAGGCTGTGGAAGCCATGAGAGTGGTGGAGAT 
CAAAGTCCCAGGTCGTTACATGTTCGTGAGCAAGATAGGTTTCTTCCGATTGCTAACATA 
AGCCGTATCATGAAAAGAGGTCl^CCTGCTAATGGGAAAATCGCTAAAGATGCTAAGGAG

ATTGTGCAGGAATGTGTCTCTGAATTCATCAGTTTCGTCACCAGCGAAGCGAGTGATAAA

TGTCAAAGAGAGAAAAGGAAGACTATTAATGGAGATGATTTGCTTTGGGCAATGGCTACT 

TTAGGATTTGAAGACTACATGGAACCTCTCAAGGTTTACCTGATGAGATATAGAGAGGGT 

GACACAAAGGGATCAGCAAAAGGTGGGGATCCAAATGCAAAG 

CAAAATGGCCAGTTCTCGCAGCTTGCTCACCAAGGTCCT^ 

TTTCCTCTCTTCTCTTCACACTCAAGCAATACGCATCATTCTCTTCTAATTTGTTAA 

>G1364 Amino Acid Sequence (conserved domain in AA coordinates: 29-120)

MAESQAKSPGGCGSHESGGDQSPRSLHVREQDRFLPIANISRIMKRGLPANGKIAKDAKE

IVQECVSEFISFVTSEASDKCQREKRKTINGDDLLWAMATLGFEDYMEPLKVYLMRYREG

DTKGSAKGGDPNAKKDGQSSQNGQFSQLAHQGPYGNSQVTFPLFSSHSSNTHHSLLIC* 
>G1379 (68..622)

CTCTGCCTCTCTCTCTCTCTCAAAACCCATCTCGAAAGTCTTTCTCTTTCGAGGGTTTAG 
ATCCTCCATGGAAGGCGGCGGAGTTGCTGACGTGGCTGTCCCCGGTACGAGGAAGAGAGA 
CAGACCHTACAAAGGAATTAGGATGAGGAAGTGGGGAAAGTGGGTGGCGGAGATTCGTGA 
GCCTAACAAGCGCTCTAGGTTATGGCTTGGCTCTTACTCTACTCCCGAGGCGGCGGCGCG 
AGCTTACGACACGGCGGTTTTCTATCTTAGAGGACCTACGGCGAGGCTTAACTTCCCTGA 
GCTTCTTCCTGGGGAGAAATTCTCCGACGAGGATATGTCGGCTGCGACCATCAGGAAGAA 
AGCCACGGAGGTCGGTGCTCAGGTTGATGCTTTGGGCACGGCGGTGCAAAATAACCGCCA 
CCGTGTTTTTGGTCAGAATCGAGATAGTGATGTGGATAATAAGAATTTTCATCGGAATTA
TCAAAACGGTGAACGAGAAGAAGAAGAAGAAGATGAGGATGACAAGAGATTGAGGAGTGG 
CGGCCGGTTATTGGATCGGGTTGACTTGAATAAATTACCCGACCCGGAAAGCTCCGATGA 
AGAATGGGAAAGCAAACATTAAAAATATATAGTTTGGAGCGGTGGCTGTTGCTAACGTAC 
GCCAACGGCTTGCTTCTACGAATCATTAGCGCCGTTTATGA^ 

TGCGAGTTTTGCGGTTTATGGAATTTTAGGCTATTGCTTAACGAAAAAAAAAAAAAAAA 

>G1379 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MEGGGVADVAVPGTRKRDRPYKGIRMRKWGKWVAEIREPNKRSRLWLGSYSTPEAAARAY

DTAVFYLRGPTARLNFPELLPGEKFSDEDMSAATIRKKATEVGAQVDALGTAVQNNRHRV

FGQNRDSDVDNKNFHRNYQNGEREEEEEDEDDKRLRSGGRLLDRVDLNKLPDPESSDEEW

ESKH* 

>G1384 (33..977)

GCGGCGAGCTTATGGAAGCACTTCAACCTTTTTACAAAAGTGCTTCCACGTCTGCTTC^

ATCCTGCGTTTGCGTCCTCAAACGATGCGTTTGCGTCTGCCCCAAACGACCTATTTTCTT 

CTTCTTCTTACTATAATCCTCATGCATCrrTTATTCCClTC^CATTCCAC^CCTCTTACC 

CGGATATTTATTCTGGATCCATGACCTATCCATCTTCATTCGGGTCGGATCTTCAACAAC 

CCGAAAACTACCAATCTC^GTTCCATTACCAAAACACTATCACTTACACTCACCAAGACA 

AC^CACTTG^TGCTTAACTTCATTGAGCCGAGCCAACCGGGTTTTATGACCCAACCGG 

GTCCGAGTTCGGGTTCGGTTTCAAAACCGGCTAAGCTCTATAGAGGAGTGAGGCAAAGAC 

ATTGGGGAAAATGGGTCGCGGAGATCCGTTTACCCAGGAACCGAACCCGACTTTGGCTCG 

GAACATTCGACACGGCTGAAGAAGCCGCGTTGGCTTATGATCGCGCCGCGTTTAAGCTTC 

GTGGTGACTCGGCTCGGCTTAACTTCCCAGCTCTCCGATACCAAACCGGCTCGTCTCCGT 

CTGATACCGGCGAATATGGTCCTATTGAAGCTGCCGTAGACGCTAAACTAGAAGCCATAT 

TAGCTGAGCCGAAGAATCAGCCGGGCAAAACGGAGAGGACGTCGAGGAAACGAGCTAAAG 

CCGCGGCTTCTTCAGCTGAGCAGCCGTCAGCGCCACAACAACATTCCGGGTCGGGTGAAA 

GTGATGGGTCGGGTTCACCGACTTCGGATGTTATGGTGCAGGAGATGTGCCAAGAGCCAG 

AGATGCCATGGAATGAAAATTTCATGCTCGGCAAGTGTCCTTCTTATGAGATAGATTGGG

CTTCAATTTTATCGTGAAAAATTAGGATTCAATT 

TATTTTCTTTTAAACTTTAGGGTTATTAGCTGTGCGTAA 

>G1384 Amino Acid Sequence (domain in AA coordinates: TBD) 

MADLFGGGHGGELMEALQPFYKSASTSASNPAFASSNDAFASAPNDLFSSSSYYNPHASL

FPSHSTTSYPDIYSGSMTYPSSFGSDLQQPENYQSQFHYQNTITYTHQDNNTCMLNFIEP 

SQPGFMTQPGPSSGSVSKPAKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGTFDTAEEAAL

AYDRAAFKLRGDSARLNFPALRYQTGSSPSDTGEYGPIQAAVDAKLEAILAEPKNQPGKT 

ERTSRKRAKAAASSAEQPSAPQQHSGSGESDGSGSPTSDVMVQEMCQEPEMPWNENFMLG 

KCPSYEIDWASILS* 



>G1399 (261..1475)

AGGTCGAATTTTCTGAAATTAAGATTGATTCCTC 

CTTTAGCTTAGCTTAGCTTCTACTGATCTGTTTTTGCTACAAAATCCCATCTTrTrCTTT 
AAAACTCTTTATCTCTGAATCTTGAGTTTCTTGTAGAAGAAGAAGCAATTTTGAATCTTT 
CGTAATCATAAAGATTCGTGGAGGATCTCTACTGATTTGTCGGAATCTCTCACTACAGAA 
TCACTTGATCTTATGTCCGGATGGAGGAGAGAGAAGGAACC^CATC^CAACAACATCA 
CTAGCAGTTTCGGCTTGAAGCAGCAACATGAAGCTGCTGCTTCTGATGGTGGTTACTCAA 
TGGACCCACCACCAAGACCCGAAAACCCTAACCCGTTTTTAGTCCCACCCACTACTGTCC 
CCGCGGCCGCCACCGTAGCAGCAGCTGTTACTGAGAATGCGGCTACTCCGTTTAGCTTAA
CAATGCCGACGGAGAACACTTCAGCTGAGCAGCTGAAAAAGAAGAGAGGTAGGCCGAGAA 
AGTATAATCCCGATGGGACTCTTGTCGTGACTTTATCGCCGATGCCAATCTCGTCCTCTG 
TTCCGTTGACGTCGGAGTTTCCTCCAAGGAAACGAGGAAGAGGACGTGGCAAGTCTAATC 
GATGGCTCMGAAGTCrrC^^TGTTCCAATTCGATAGAAGTCCTGTTGATACCAATTTGG 
CAGGTGTAGGAACTGCTGATTTTGTTGGTGCCAACTTO 

ACGCCGGAGAGGATGTGACGATGAAGATAATGACATTCTCTCAACAAGGATCTCGTGCTA 
TCTGCATCCTTTCAGCTAATGGTCCCATCTCCAAT 

CCGGTGGTACTCTAACTTATGAGGGTCGTTTTGAGATTCTCTCTTTGACGGGTTCGTTTA 

TGCAAAATGACTCTGGAGGAACTCGAAGTAGAGCTGGTGGTATGAGTGTTTGCCTTGCAG 

GACCAGATGGTCGTGTCTTTGGTGGAGGACTCGCTGGTCTCTTTCnTGCTGCTGGTCCTG 

TCCAGGTAATGGTAGGGACTTTTATAGCTGGTCAAGAGCAGTaVCAGCTGGAGCTAGCAA 

AAGAAAGACGGCTAAGATTTGGGGCTCAACCATCTTCTATCTCCT^ 

AAGAACGGAAGGCGAGATTCGAGAGGCTTAAC^GTCTGTTGCTATTCCTGCACCAACCA 

CTTCATACACGGATGTAAAGACAACAAATG 

ACCATGTCAAGGATCCCTTCTCGTCTATCCCAGTAGGAGGAGGAGGAGGTGGAGAGGTAG 
GAGAAGAAGAGGGTGAAGAAGATGATGATGAATTAGAAGGTGAAGACGAAGAATTCGGAG 
GCGATAGCCAATCTGACAACX5AGATTCCGAGCTGATGATGATCATACGGTTTCTTTTCGC 
GGATTTGTTAGGTTTGATGGATTTCAGATTTTGGTTGATTGTTTTTATTAACACAGAATG 
XTTAGAAGCTGCTATCTITAGG1TCCCATCCTCTTGTGATTGTTGAGTATCCTTGTTAGA 
AACAAACTTACTGTTGCAAAACTCTCTTCAAA 

>G1399 Amino Acid Sequence (domain in AA coordinates: 86-93) 

MEEREGTNINNNITSSFGLKQQHEAAASDGGYSMDPPPRPENPNPFLVPPTTVPAAATVA

AAVTENAATPFSLTMPTENTSAEQLKKKRGRPRKYNPDGTLVVTLSPMPISSSVPLTSEF

PPRKRGRGRGKSNRWLKKSQMFQFDRSPVDTNLAGVGTADFVGANFTPHVLIVNAGEDVT

MKIMTFSQQGSRAICILSANGPISNVTLRQSMTSGGTLTYEGRFEILSLTGSFMQNDSGG

TRSRAGGMSVCLAGPDGRVFGGGLAGLFLAAGPVQVMVGTFIAGQEQSQLELAKERRLRF

GAQPSSISFNISAEERKARFERLNKSVAIPAPTTSYTHVNTTNAVHSYYTNSVNHVKDPF

SSIPVGGGGGGEVGEEEGEEDDDELEGEDEEFGGDSQSDNEIPS* 

>G1415 (60..680)

CCTTATCACTC^CCAAAAGTCGTCACATAATATCACTTTCGAGTTATCAACATCCGTACZA 
TGTCATCCATAGAGCCAAAAGTAATGATGGTTGGTGCTAATAAGAAACAACGAACCGTCC 
AAGCTAGTTCGAGGAAAGGTTGTATGAGAGGAAAAGGTGGACCCGATAACGCGTCTTGCA 
CTTACAAAGGTGTTAGACAACGCACTTGGGGCAAATGGGTCGCTGAGATCCGCGAGCCTA 
ACCGAGGAGCTCGTCTTTGGCTCGGTACCTTCGACACCTCCCGTGAAGCTGCCTTGGCTT 
ATGACTCCGCAGCTCGTAAGCTCTATGGGCCTGAGGCTCATCTCAACCTCCCTGAGTCCT 
TAAGAAGTTACCCTAAAACGGCGTCGTCTCCGGCGTCCCAGACTACACCAAGCAGCAACA 
CCGGTGGAAAAAGCAGCAGCGACTCTGAGTCGCCGTGTTCATCCAACGAGATGTCATCAT 
GTGGAAGAGTGACAGAGGAGATATCATGGGAGCATATAAACGTGGATTTGCCGGTAATGG 
ATGATTCTTCAATA^^GAAGAAGCTACAATGTCGTTAGGATTTCCATGGGTTCATCAAG 
GAGATAATGATATTTCTCGGTTTGATACTTGTATTTCCGGTGGCTATTCTAATTGGGATT 
CCTTTCATTCCCCACTTTGAGGTGTCACTAGACTCTCTTTAATTGTTAAGTTATCATATA 
CAAACTACATATATATACAAATATAGTCACCGTGAACTAGGATATATATGTAAATTU^ACA 
CCAGTTACATGTACTTATATATGTGCACATCTATATATGTGGTTTGTCTGTATAGTGTGA 
AAGCAGATTCTTACCATATCA 

>G1415 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSSIEPKVMMVGANKKQRTVQASSRKGCMRGKGGPDNASCTYKGVRQRTWGKWVAEIREP
NRGARLWLGTFDTSREAALAYDSAARKLYGPEAHLNLPESLRSYPKTASSPASQTTPSSN
TGGKSSSDSESPCSSNEMSSCGRVTEEISWEHINVDLPVMDDSSIWEEATMSLGFPWVHE 
GDNDISRFDTCISGGYSNWDSFHSPL*
>G1417 (32..1501)

TCTATCTCTATOTATCTCTCTTTGTCTGCAAATGGAAGAACATATTCAAGATCGCCGTGA 
AATTGCGTTCTTACACTCAGGAGAATTTCTCCACGGAGATTCTGACTCAAAGGATCATC^ 
ACCGAACGAGTCTCCGGTGGAACGTCATCACGAGTCGTCTATCAAAGAAGTTGATTTCTT 
CGCTGCTAAAAGTCAGCCGTTTGATCTTGGTCATGTGAGAACAACGACGATCGTTGGATC 
ATCTGGTTTTAATGATGGATTAGGTTTGGTAAATTCATGTCATGGAACATCAAGCAATGA 
TGGCGATGACAAAACCAAAACTCAAATTAGTAGACTGAAGTTGGAGCTAGAGAGGCTTCA 
CGAGGAGAATCACAAACTGAAGCATTTATTAGATGAGGTCAGTGAGAGTTACAACGACCT 
CCAAAGAAGAGTTTTGTTAGCAAGACAAACACAAGTGGAAGGTCTTCATCATAAACAACA
TGAGGATGTACCTCAAGCTGGTTCCTCACAAGCTCTAGAGAACAGAAGACCAAAGGATAT 
GAACCATGAAACTCCGGCCACCACCTTGAAACGACGGTCTCCAGACGACGTGGATGGTCG 
TGATATGCACCGAGGATCACCAAAAACTCCTCGAATAGACCAAAACAAGAGTACTAATCA 
TGAAGAACAACAAAACCCTCATGATCAATTACCCTATAGAAAAGCTAGGGTTTCCGTTAG 
AGCTAGATCTGATGCCACTACGGTAAATGACGGATGTCAATGGAGAAAATACGGTCAGAA 
AATGGCGAAAGGGAATCCATGTCCTCGCGCTTATTATCGTTGCACCATGGCCGTTGGATG 
TCCTGTCCGTAAACAGGTCCAACGATGCGCGGAGGATACAACTATCTTGACAACAACGTA 
CGAAGGAAACCATAACCATCCTCTTCCCCCGTCAGCCACAGCCATGGCTGCAACCACCTC 
CGCCGC^GC^GCCATGCTCTTATCAGGCTCCTCCTCCAGC^CCTCCACC^AACACTCTC 
TAGCCCCTCCGCCACGTC^TCATCATCCTTCTACCATAACTTCCCATACACCTCCACAAT 
CGCAACACTCTCTGCCTCAGCTCCTTTCCCCACCATAACCTTAGACCTCACCAACCCACC 
TCGACCGCTAGAACCGCCACCGCAGTTTCTAAGCCAGTATGGTCCCGCCGCGTTTTTACC 
AAACGCTAATCAAATTAGGTCTATGAATAATAATAACCAGCAGTTATTAATACCTAATTT 
GTTTGGCCCACAAGCCCCACCACGTGAAATGGTCGATTCAGTTAGGGCTGCGATTGCGAT 
GGATCCGAACTTCACGGCGGCACTTGCGGCCGCGATCTCAAACATTATCGGAGGAGGTAA 
TAACGACAACAATAATAATACTGATATTAATGATAACAAGGTTGATGCAAAAAGTGGAGG 
GAGTAGTAACGGAGATTCGCCACAGCTTCCTCAGTCTTGCACCACTTTCTCTACAAACTA 
ATTTTACTACCATTATTATATGTTATCTTATTATATATTACACACACATATTATACATTA 
TGCGTATCTTAAGTTTTTTTTTGGGGGCCATTATATATGAATGATATGGAGATCACTGAG 
AGAGAGAGAGAGCTATTATGGGTTTTTTTTT 

>G1417 Amino Acid Sequence (domain in AA coordinates: 239-296) 
MEEHIQDRREIAFLHSGEPLHGDSDSKDHQPNESPVERHHESSIKEVDFFAAKSQPFDLG 
HVRTTTIVGSSGFNDGLGLVNSCHGTSSNDGDDKTKTQISRLKLELERLHEENHKLKHLL 
DEVSESYNDLQRRVLLARQTQVEGLHHKQHEDVPQAGSSQALENRRPKDMNHETPATTLK

RRSPDDVDGRDMHRGSPKTPRIDQNKSTNHEEQQNPHDQLPYRKARVSVRARSDATTVND

GCQWRKYGQKMAKGNPCPRAYYRCTMAVGCPVRKQVQRCAEDTTILTTTYEGNHNHPLPP

SATAMAATTSAAAAMLLSGSSSSNLHQTLSSPSATSSSSFYHNFPYTSTIATLSASAPFP

TITLDLTNPPRPLQPPPQFLSQYGPAAFLPNANQIRSMNNNNQQLLIPNLFGPQAPPREM

VDSVRAAIAMDPNFTAALAAAISNIIGGGNNDNNNNTDINDNKVDAKSGGSSNGDSPQLP

QSCTTFSTN* 

>G1442 (1..1293) 

ATGGGAACAAGAGCAGAACGCAAGGAAGATTTTGTTGGTGGGTTTGGATTTGGTGTTGTA 
GAAAATTCGCATAAAGACGTTATGGTGCTACCTCATCATCACTATTATCCATCATATTCA 
TCACCTTCCTCTTCTTCTTTGTGTTACTGTTCTGCTGGTGTTAGCGATCCCATGTTCTCT 
GTTTCTAGCAATCAGGCTTACACTTCTTCTCACAGTGGTATGTTCACACCCGCCGGTTCT 
GGTTCTGCTGCTGTGACTGTAGCAGATCCTTTTTTCTCCTTGAGCTCTTCAGGGGAAATG 
AGAAGAAGTATGAACGAAGATGCTGGTGCAGCTTTCAGCGAAGCTCAATGGCATGAGCTT 
GAGAGGCAGAGGAA^ATATACAAGTACATGATGGCTTCTGTTCCTGTTCCTCCAGAGCTT 
CTCACACCCTTTCCCAAGAACCACGAATCAAACACTAACCCGGATGTAACTGTGGCAGTG 
GCGACAGGAGGCTCATTGCAGCTGGGGATTGCTTCAAGCGCAAGCAATAACACGGCTGAT 
CTGGAGCCATGGAGGTGCAAGAGAACAGATGGGAAGAAATGGAGGTGCTCTAGAAACGTG 
ATTCCTGATCAGAAATACTGTGAGAGACACACACACAAGAGCCGTCCTCGTTCAAGAAAG 
CATGTGGAATCATCTCACCAATCATCTCACCACAATGACATTCGTACGGCTAAGAATGAT 
ACTAGCCAGCTTGTGAGAACTTATCCTCAGTTTTACGGACAACCTATAAGCCAGATCCCT 
GTGCTTTCTACTCTTCCGTCTGCCTCCTCTCCATATGATCACCACAGAGGACTGAGGTGG 
TTTACGAAAGAAGATGATGCCATTGGAACCTTAAACCCGGAGACTCAAGAAGCTGTCCAG 
CTGAAAGTTGGATCAAGCAGAGAGCTCAAACGGGGATTCGATTATGATCTGAATTTCAGG 
CAGAAAGAGCCAATAGTAGACCAGAGCTTTGGAGCATTGCAGGGTCTATTAAGTCTAAAC
CAGACACCACAACATAACCAAGAAACAAGACAGTTTGTTGTAGAAGGAAAGCAAGATGAA 
GCGATGGGAAGCTCTCTGACACTCTCAATGGCTGGAGGAGGCATGGAGGAAACAGAGGGA 
ACAAACCAGCATCAGTGGGTTAGCCATGAAGGTCCATCATGGCTCTATTCAACAACACCA
GGTGGACCATTGGCTGAAGCACTGTGTCTCGGTGTCTCCAACAACCCAAGTTCTAGTACT 
ACTACTAGTAGCTGCAGCAGAAGCTCAAGCTAA 

>G1442 Amino Acid Sequence (domain in AA coordinates: 172-223) 

MGTRAERKEDFVGGFGFGVVENSHKDVMVLPHHHYYPSYSSPSSSSLCYCSAGVSDPMFS 

VSSNQAYTSSHSGMFTPAGSGSAAVTVADPFFSLSSSGEMRRSMNEDAGAAFSEAQWHEL 

ERQRNIYKYMMASVPVPPELLTPFPKNHESNTNPDVTVAVATGGSLQLGIASSASNNTAD

LEPWRCKRTDGKKWRCSRNVIPDQKYCERHTHKSRPRSRKHVESSHQSSHHNDIRTAKND

TSQLVRTYPQFYGQPISQIPVLSTLPSASSPYDHHRGLRWFTKEDDAIGTLNPETQEAVQ 
LKVGSSRELKRGFDYDLNFRQKEPIVDQSFGALQGLLSLNQTPQHNQETRQFVVEGKQDE 
AMGSSLTLSMAGGGMEETEGTNQHQWVSHEGPSWLYSTTPGGPLAEALCLGVSNNPSSST

TTSSCSRSSS* 
>G1454 (86..1180)

CTAGTAGTGATGATATGATCGCTTCTTCTCCTACT^TCTCAGAAACCTCCGATCACGGTT 

TTAGATATCTTCTACAACGGATACAATGGAGAGCACCGATTCTTCCGGTGGTCCACCACC 

GCCACAACCTAACCTTCCTCCAGGCTTCCGGTTTC^CCCTACCGACGAAGAGCTTGTTGT 

TCACTACCTCAAACGCAAAGC^GCCTCTGCTCCTTTACCTGTCGCCATCATCGCCGAAGT 

CGATCTCTATAAATTTGATCCATGGGAACTTCCCGCTAAAGCATCGTTTGGAGAACAAGA 

ATGGTACTTCTTTAGTCCACGAGATCGGAAGTATCCAAACGGAGCAAGACCAAACAGAGC 

GGCGACTTCAGGTTATTGGAAAGCGACCGGTACAGATAAACCGGTACTTGCTTCCGACGG 

TAACCAAAAGGTGGGCGTGAAGAAGGCACTAGTCTTCTACAGTGGTAAACCACCAAAAGG 

CGTTAAAAGTGATTGGATCATGCATGAGTATCGTCTCATCGAAAACAAACCAAACAATCG 

ACCTCCTGGCTGTGATTTCGGCAACAAAAAAAACTCACTCAGACTTGATGATTGGGTG^ 

ATGTAGAATCTACAAGAAGAACAACGCAAGTCGACATGTTGATAACGATAAGGATCATGA 

TATGATCGATTACATTTTCAGGAAGATTCCTCCGTCTTTATCAATGGCGGCTGCTTCTAC 

AGGACTTCACCAACATCATCATAATGTCTCAAGATCAATGAATTTCTTCCCTGGCAAATT 

CTCCGGTGGTGGTTACGGGATTTTCTCTGACGGTGGTAACACGAGTATATACGACGGCGG 

TGGCATGATCAACAATATTGGTACTGACTCAGTAGATCACGACAATAACGCTGACGTCGT 

TGGTTTAAATCATGCTTCGTCGTCAGGTCCTATGATGATGGCGAATTTGAAACGAACTCT 

CCCGGTGCCGTATTGGCCTGTAGCAGATGAGGAGCAAGATGCATCTCCGAGCAAACGGTT 

TCACGGTGTAGGAGGAGGAGGAGGAGATTGTTCGAACATGTCTTCCTCCATGATGGAAGA 

GACTCCACCATTGATGCAACAACAAGGTGGTGTGTTAGGAGATGGATTATTCAGAACGAC 

ATCGTACCAATTACCCGGTTTAAATTGGTACTCTTCTTAATCAAATGTGTTTCGCCGCCG 

GTGTGAAGAATTTTCCGGTGACAGTGAAGATTTTTTTCCGATTGGTGGGGTCATTTGCAT 

GC^TTATATAATTTGAGATTTGTGTATATGTTTTGGGTTAATTAATTGGTCACAGGGGC 

>G1454 Amino Acid Sequence (conserved domain in AA coordinates: 9-178)

MESTDSSGGPPPPQPNLPPGFRFHPTDEELVVHYLKRKAASAPLPVAIIAEVDLYKFDPW

ELPAKASFGEQEWYFFSPRDRKYPNGARPNRAATSGYWKATGTDKPVLASDGNQKVGVKK

ALVFYSGKPPKGVKSDWIMHEYRLIENKPNNRPPGCDFGNKKNS 

ASRHVDNDKDHDMIDYIFRKIPPSLSMAAASTGLHQHHHNVSRSMNFFPGKFSGGGYGIF

SDGGNTSIYDGGGMINNIGTDSVDHDNNADVVGLNHASSSGPMMMANLKRTLPVPYWPVA

DEEQDASPSKRFHGVGGGGGDCSNMSSSMMEETPPLMQQQGGVLGDGLFRTTSYQLPGLN
WYSS* 

>G1459 (1..1272) 

ATGATGAAAGGTCTGATTGGGTATAGATTTAGTCCGACGGGAGAGGAAGTGATCAACCAT 
TACCTAAAGAACAAACTTCTGGGTAAGTATTGGCTCGTTGATGAAGCTATTAGCGAGATC 
AACATCTTGAGTCACAAACCC^GCAAGGATTTGCCTAAGTTAGCTAGGATCCAATCGGAA 
GATCTTGAATGGTATTTCTTCTCTCCGATTGAGTACACGAACCCGAATAAGATGAAAATG 
AAGAGGACGAC^GGTTCTGGGTTTTGGAAACCTACTGGTGTTGATCGGGAAATTAGGGAT 
AAAAGAGGAAATGGTGTTGTGATAGGGATTAAGAAGACGCTTGTGTACCATGAAGGTAAG 
AGTCCTCATGGAGTTAGAACTCCTTGGGTTATGCACGAGTATCACATCACTTGCTTGCCT 
CATCATAAGAGGAAATATGTTGTCTGCCAAGTAAAGTATAAGGGTGAAGCTGCAGAAATT 
TCATATGAGCCAAGTCCCTCTTTGGTATCCGATTCGCATACCGTCATAGCGATTACCGGA 
GAACCGGAACCTGAGCTTCAGGTTGAGCAGCCAGGTAAAGAAAATCTCTTGGGTATGTCT 



GTAGATGATTTGATAGAACCAATGAACCAACAAGAGGAGCCACAAGGTCCTCACTTAGCT

CCGAATGATGATGAGTTTATACGTGGATTGAGGCATGTTGATCGAGGGACGGTTGAATAT 

TTGTTTGCCAATGAAGAAAACATGGATGGTTTGTCTATGAATGACTTGAGAATC 

ATCGTCCAACAAGAGGATCTCTCTGAGTGGGAGGGATTTAACGCAGACACCTTTTTCAGC 

GACAACAACAATAACTATTyvCCTTAACGTGCATCATCAACTAACGCCTTACGGCGATGGC 

TATTTGAATGCATTTTCGGGTTATAACGAAGGGAATCCTCCCGATCACGAATTAGTGATG 

CAAGAGAACCGCAACGATCACATGCCAAGGAAACCTGTGACAGGGACCATTGATTATAGC 

AGCGATAGTGGCAGTGATGCTGGATCCATATCTACAACGGTGAAACAAGAAATCCCAAGA 

GCTGTTGATGCACCCATGAACAATGAGTCATCTTTGGTGAAAACAGAGAAGAAAGGCTTG 

TTTATTGTAGAGGACGCAATGGAGAGAAACCGCAAGAAACCACGATTTATCTATCTCATG 

AAGATGATCATAGGCAACATCATATCGGTTTTACTACCCGTCAAAAGATTGATCCCGGTG 

AAGAAGTTATGA 

>G1459 Amino Acid Sequence (conserved domain in AA coordinates: 10-152)

MMKGLIGYRFSPTGEEVINHYLKNKLLGKY^

DLEWYFFSPIEYTNPNKMKMKRTTGSGFWKPTGVDREIRDKRGNGVVIGIKKTLVYHEGK

SPHGVRTPWVMHEYHITCLPHHKRKYVVCQVKYKGEAAEISYEPSPSLVSDSHTVIAITG

EPEPELQVEQPGKENLLGMSVDDLIEPMNQQEEPQGPHLAPNDDEFIRGLRHVDRGTVEY

LFANEENMDGLSMNDLRIPMIVQQEDLSEWEGFNADTO 

YLNAFSGYNEGNPPDHELVMQENRNDHMPRKPVTGTIDYSSDSGSDAGSISTTVKQEIPR
AVDAPMNNESSLVKTEKKGLFIVEDAMERNRKKPRFIYLMKMIIGNIISVLLPVKRLIPV
KKL* 

>G1460 (87..995)

CGTCGACCTTCACTCAT^CCCTAATCCCGGGAACCCGGGAATTTTGATCATTTTGTTTCT 
TTTCGATCTGTTTCTATTTTAAAAAGATGATGAAAGATCCGACTGGGTATAGATTTAGTC 
CGACGGGAGAGGAAGTGATAAACCATTACCTAAAGAACAAAATTCTGGGTAAGACTTGGC 
TCGTTGATGAAGCCATTAGCGAGATCAACATCTTGAATCACAAACCCAGCAAGGATTTGC
CTAAGTTAGCTAGGATCCAATCGGAAGATCTTGAGTGGTACTTTTTCTCTCCGATTGAGT 
ACACGAACCCGAATAAGATGAAAATGAAGAGGACGACAGGTTCTGGGTTTTGGAAACCTA 
GTGGTGTTGATCGGAAAATTAGGGATAAAAGAGGAAATGGTGTTGTGATAGGGATTAAGA 
AGACGCTTGTGTACCATGAAGGTAAGAGTCCTCATGGAGTTAGAACTCCTTGGGTTATGC 
ACGAGTATCACATCACTTGCTTGCCTCATCATAAGAGGAAATATGTTGTCTGCCAAGTAA 
AGTATAAGGGTGAAGCTGCAGAAATTTCATATGAGCCAAGTCCCTCTTTGGTATCCGATT 
CGCATACCGTCATAGCGATTAACGGAGAACCGGAACCTGAGCTTCAGGTTGAGCAGCCAG 
GTAAAGAAAATCTCTTGGGTATGTCTGTAGATGATTTGATAGAACCAATGAACCAACAAG
AGGAGCCACAAGGTCCTCACTTAGCTCCGAATGATGATGAGTTTATACGTGGATTGAGAC 
ATGTTGATCGAGAGCCGGTTGAATATTTGTTTGCCAATGAAGAAAACATGGATGGTTTGT 
CTATTATGAATGACTTGACAATCCCAATGATCGCCCAACAAGAGGATCTCATTCTCTCTG 
AGTGGGAGGGATTTATCGCAGCCACCTTTTTCAGCGACAACAACAATAACAATAACCTTA 
ACGTGCATCAACTAACGTCTTTCTTACCGGGATGATTATCAGAATGCATTTTGGGTTACA 
ACGGAGCGNCCGCT 

>G1460 Amino Acid Sequence (domain in AA coordinates: TBD) 
MMKDPTGYRFSPTGEEVINHYLKNKILGKTWLVDEAISEINILNHKPSKDLPKLARIQSE
DLEWYFFSPIEYTNPNKMKMKRTTGSGFWKPSGVDRKIRDKRGNGVVIGIKKTLVYHEGK
SPHGVRTPWVMHEYHITCLPHHKRKYVVCQVKYKGEAAEISYEPSPSLVSDSHTVIAING

EPEPELQVEQPGKENLLGMSVDDLIEPMNQQEEPQGPHLAPNDDEFIRGLRHVDREPVEY 

LFANEENMDGLSIMNDLTIPMIAQQEDLILSEWEGFIAATFFSDNNNNNNLNVHQLTSFL

PG* 

>G147 (37..672)

AAATCATCAGATAGAAGGAAATATTCTGATTGAGAGATGGCTCGTGGAAAGATTCAGCTT 
AAGAGGATTGAGAACeCGGTTCA(^GAGAAGTGACTTTTTGCAAGAGGAGAACTGGTCTT 
CTCAAGAAGGCTAAGGAGCTCTCTGTGCTCTGTGATGCCGAGATCGGTGTTGTGATCTTC 
TCTCCTCAGGGCAAGCTCTTTGAGCTCGCTACTAAAGGAACAATGGAGGGAATGATTGAT 
AAGTACATGAAGTGTACTGGTGGTGGTCGTGGTTCTTCTTCTGCTACTTTTACTGCTCAA 
GAACAACTTCAACCACCAAATCTTGATCCGAAAGATGAGATCAACGTGCTTAAGCAAGAG
ATTGAGATGCTTCAGAAAGGGATAAGCTATATGTTTGGAGGAGGAGATGGGGCTATGAAT 
CTTGAAGAACTTCTTTTGCTTGAGAAGCATCTTGAGTATTGGATTTCTCAGATTCGCTCT 
GCTAAGATGGATGTTATGCTTCAAGAAATTCAGTCATTGAGGAACAAGGAAGGAGTCCTC 
AAAAA(^CCAACAAGTATCTCCTCGACAAGATAGAGGAAAACAACAATAGCATATTAGAT

GCTAACTTCGCAGTCATGGAGACAAACTATTCCTATCCGCTAACAATGCCAAGTGAAATA 

TTTCAGTTCTAGACCATAGGGTATTTGAAGACTATGTCTCACGAATTTAAATAACCTTGG 

TAAGTATAATATAGTGTTGTTAAATCACACATAATTAAAATAAAGCCTGTGGAACTTCGC 

TAGGCAGTTGAAAATCTATCCGTATGTTTTATCCTCTTGTTTTACATTTGTTGGTGTC 

GATGAAATGACTGCAAGTGTGGTGTGTACTTATAACTCTTTCTACTTTCTATCTATGTTO 

TGAATTTATGGATT 

>G147 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MARGKIQLKRIENPVHRQVTFCKRRTGLLKKAKELSVLCDAEIGVVIFSPQGKLFELATK

GTMEGMIDKYMKCTGGGRGSSSATFTAQEQLQPPNLDPKDEINVLKQEIEMLQKGISYMF

GGGDGAMNLEELLLLEKHLEYWISQIRSAKMD^ 

ENNNSILDANFAVMETNYSYPLTMPSEIFQF* 

>G1471 (1..735) 

ATGGAGAACCAATCTATGTCTTCATCAAGCTCCTCCACACACAAACATGATCAAAAACTC 

AAAAGTTCCGTTGTGGCCATGGAGGTCCTGGAGGAGAAGGAGACAGTGAACAATCCGCCC 

CAGTATTATAATAAGATCTACATCTGTTACTTGTGCAAGAGAGCGTTCCCAACCCCTCAT 

GCCCTTGGCGGTCACGGAACCACCCACAAGGAGGACCGAGAATTGGAGAGGCAACAGATC 

GAGTCAAGGCTTTCTAACAAAGACAAGTCTAACTTGCTCTTTGGTGGGTCTTCACAAGAT 

GTTTTATCAAATGATAATCACCTTGGACTCTCTCT 

AGCAGGAGC^GCAACAACGTTAACCCATTC 

GATATGAACATGAACAACTATAGCTCACATGC^ 

CTTACTCTTGGTCC^TCTAAGTCCATAGGAGATAG(^C^TATCATTAATAAa^CACT 
AACTCATCCTTCGATGGGAATCTGATCATTCCCGTTCGTCCTCGTGTGTCTAGATACCAT 
TTTGTTGCTGGGAACCCCCTTGATTCAATCT 

CCTCATCTAAACATCAATCTXTCTCATGATTCGTTTTCTTTACAAGAGAATGGTTCGGGC 
TCTAGTCACTCATAA 

>G1471 Amino Acid Sequence (domain in AA coordinates: 49-70) 
MENQSMSSSSSSTHKHDQKLKSSVVAMEVLEEKETVNNPPQYYNKIYICYLCKRAFPTPH
ALGGHGTTHKEDRELERQQIESRLSNKDKSNLLFGGSSQDVLSNDNHLGLSLGPLKSIEG
SSSSNNWPLIJWGVPRGTTDMNMN^ 

NSSFDGNLIIPVRPRVSRYHFVAGNPLDSISRNIPPSITFPHLNTNLSHDSFSLQENGSG
SSHS* 

>G1475 (1..645) 

ATGAAGAGAACACATTTGGCAAGTTTTAGTAACAGAGACAAAACCCAAGAAGAAGAAGGA 

GAAGACGGTAATGGTGACAACAGAGTCATCATGAATCACTACAAGAATTACGAAGCTGGG 

CTGATCCCATGGCCTCCCAAGAATTAC^CTTGCAGCTTCTGCAGGAGAGAGTTCAGATCT 

GCTCAAGCACTTGGAGGCCACATGAATGTTCATAGAAGAGACAGAGCAAAACTCAGGCAG 

ATCCCTTCTTGGCTCTTCGAACCTCACCACCACACACCTATTGCAAACCCTAACCCTAAT 

TTTAGCTCTTCTTCTTCCTCTTCAACAACAACAGCTCATCTTGAGCCTTCC^ 

CAGAGATCCT^AAACCACTCCTTTTCCTTCTGCCCGGTTTGATCTTTTGGACAGTAC 

AGCTATGGAGGTTTGATGATGGACAGAGAGAAGAACAAGAGCAATGTATGTAGCAGAGAG 

ATCAAGAAAAGTGCCATCGATGCATGTCATTCAGTAAGATGTGAGATAAGCCGTGGGGAT 

CTGATGAATAAGT^AAGATGATCAAGTCATGGGGTTGGAGCTTGGGATGAGTTTGAGGAAT 

CCCAACCAAGTTCTTGATTTGGAGCTTCGACTAGGCTACCTCTAA 

>G1475 Amino Acid Sequence (domain in AA coordinates: 51-73) 

MKRTHLASFSNRDKTQEEEGEDGNGDNRVIM

AQALGGHMNVHRRDRAKLRQIPSWLFEPHHHTPIANPNPNFSSSSSSSTTTAHLEPSLTN 
QRSKTTPFPSARFDLLDSTTSYGGLMMDREKNKSNVCSREIKKSAIDACHSVRCEISRGD
LMNKKDDQVMGLELGMSLRNPNQVLDLELRLGYL* 
>G1477 (1..606)

ATGTTGTCCTCGGACTCGAATTACGCTAGTGATATTAGCGACGATGCCTCCGCCACCGGA 

TCGATAGAGAATCCTATATACAAATGCAAGTATTGTCCTAGGAAGTTCGATAAAACACAA 

GCATTAGGTGGTCATC^AAATGCACACAGAAAGGAGAGAGAGGTCGAAAAACAACAAA^ 

GCATTTTTGGCGCATTTGAACCGACCAGAACCAGATCTTTACGCGTACTCGTATTC^ 

CATCATTCATTTCCTAACCAATACGCACTCCCACCGGGATTTGAACAGCCTCAGTACAAA 

GTTGATAGATCATACAAGATGTCCATGGTCTACAACCAATATGTGGGATCCTCAAGCTCT 

AGCTTTGCAGGACTACAAAGTGACCCAAGTCAAGGAATGAACCAGGATTGGACCTTTACC 
GGGATCCCATTCCTACCCCAATCTCAACCTCAACCACTATCGTCACCAATATGTTTGGAT
CTTTGCCTTGGCATTGGTAGCTCCCAAACCCAACCACAACCTCAAGAACCAAATGATGCA 
ACAGAAGAGATGGATGCTGAGAAAGAAAATGATGGTTCTTCCCITTCTCTCT(^CTCAM 

CTGTGA 

>G1477 Amino Acid Sequence (domain in AA coordinates: 29-48) 

MLSSDSNYASDISDDASATGSIENPIYKCKYCPRKFDKTQALGGHQNAHRKEREVEKQQK 

AFLAHLNRPEPDLYAYSYSYHHSFPNQYALPPGFEQPQYKVDRSYKMSMVYNQYVGSSSS 

SFAGLQSDPSQGMNQDWTFTGIPFLPQSQPQPLSSPICLDLCLGIGSSQTQPQPQEPNDA 

TEEMDAEKENDGSSLSLSLKL* 

>G1487 (1..1020) 

ATGGAACAAGCCGCGTTGAAGAGCAGCGTCAGGAAAGAGATGGCTCTCAAAACGACTTCT 
CCGGTTTACGAAGAGTTTCTTGCCGTCACCACCGCTCAAAATGGCTTTTCCGTCGACGAT 
TTCTCTGTAGACGACTTGCTTGACTTGTCAAACGATGACGTTTTTGCCGACGAAGAAACT 
GACCTCAAGGCTCAACATGAGATGGTCCGTGTTTCCTCTGAGGAACCCAACGACGACGGA 
GAOTCTCTTCGCCGGAGCAGCGATTTCTCCGGCTGTGACGACTTTGGTTCTCTCCCTACA 
AGCGAACTCTCTCTTCCGGCGGATGATTTAGCGAACCTTGAGTGGCTCTCTCATTTCGTG 
GAGGACTCCTTCACGGAATATTCGGGTCCAAACCTCACCGGAACCCCGACTGAGAAACCG 
GCGTGGTTAACGGGTGACCGGAAACATCCTGTGACTGCAGTCACGGAAGAGACCTGTTTC 
AAATCCCCTGTTCCGGCTAAAGCCCGTAGCAAACGTAACCGCAATGGCCTCAAGGTCTGG 
TCGCTTGGTTCGTCGTCCTCCTCGGGTCCTTCCTCGTCCGGTTCGACCTCCTCCTCCTCT 
TCGGGTCCTTCCAGCCCGTGGTTCTCCGGCGCTGAGCTGCTCGAGCCTGTGGTCACGTCA 
GAGAGGCCACCGTTTCCCAAGAAGCATAAGAAAAGGTCAGCCGAGTCTGTTTTCTCCGGT 
GAGCTGC^GC^GCTGC^CCTC^GCGAAAGTGCAGCCACTGCGGCGTTCAGAAAACTCCG 
CAGTGGAGAGCCGGGCCAATGGGAGCCAAGACCCTGTGCAATGCGTGCGGTGTCCGGTAC 
AAGTCGGGTAGGTTGCTACCGGAATACAGACCCGCTTGTAGCCCGACATTCTCGAGTGAG 
CTGCACTCGAACCACCACCGGAAAGTCATAGAGATGAGGCGGAAGAAGGAGCCAACCAGT 
GAGAACGAAACCGGTTTAAACCAGCTGKjTTG^^ 

>G1487 Amino Acid Sequence (domain in AA coordinates: 251-276)
MEQAALKSSVRKEMALKTTSPVYEEFLAVTTAQNGFSVDDFSVDDLLDLSNDDVFADEET
DLKAQHEMVRVSSEEPNDDGDALRRSSDFSGCDDFGSLPTSELSLPADDLANLEWLSHFV
EDSFTEYSGPNLTGTPTEKPAWLTGDRKHPVTAVTEETCFKSPVPAKARSKRNRNGLKVW

SLGSSSSSGPSSSGSTSSSSSGPSSPWFSGAELLEPVVTSERPPFPKKHKKRSAESVFSG
ELQQLQPQRKCSHCGVQKTPQWRAGPMGAKTLCNACGVRYKSGRLLPEYRPACSPTFSSE 
LHSNHHRKVIEMRRKKEPTSDNETGLNQLVQSPQAVPSF* 
>G1492 (149..919)

AATCCCAACCC^CAC^CCTCTCAAATCCTCCTCTCCTCGTTTCTCTTTCTCTCCTCTTCA 
CAGAACCAAAACATATCAAACCTTTTTTTCTCTTGGGTTTAAGTAAAAATCGAATCTTTG 
TGTCGGTTTTTAGGGTTCTTGAAACGATATGGGTAAGTCTAGTGGTAGAAATGGTAACGG 
AAGCTTTAACGGCAATAAATTTCACGGAGTTAGACCTTACG 

GCTTAGATGGACGCCGGATCTTCACCGTTGTTTCGTTCACGCCGTCGAGATTCTCGGTGG 
TCAACACCGAGCAACACCAAAACTTGTTCTTAAGATGATGGATGTGAAGGGACTTACCAT 
TTCACATGTCAAAAGCCACCTTCAGATGTATAGAGGAGGTTCAAAGCTCACTTTGGAGAA 
ACCAGAAGAAAGCTCATCATCTTCAATAAGAAGAAGACAAGACAGTGAAGAAGATTATTA 
TCTCC^TGAC^CTTGTCTTTACA^ 

TCCTCTTTCTTCACATTCTTCATTTAGAGGAGGAGGAGGAGGAAGAACAAAAGAGCAGCA 
GACTTCAGAGTCTGGTGGTTATGATGATGATGCTGACTTTCTTCACATCAAGAAGATGAA 
CGATACGACGACGTTTTTGTCACATCATTTCCCCAAGGGAACAGAGGAGTGGCGGGAACA 
AGAACACGAAGAAGAAGAAGAAGATTTGTCG1TGTOTCTGTCGTTAAATCATCATCATTG 
GAGAAGCAATGGATCATCGGTGGTGAGCGAAACGAGTGAAGCAGCAGTCTCGACTTGTTC 
AGCACCATTCGTATCG!AAAGATTGCTTTGGTTC 

TTCTCTCCTCGGTAGCTAAATAAGTTATGCAAGATTTAGGTTCAGAGAAACTATTCGGAT 

GTGTTTTTGAAACTAGGATATTGAATGTTAGTAGAGAAACCTAGAAAATGAAGTTTAGAT 

AAATTATCAACGCAGCGTTTTGATCGCCTTTGAACGGAAAATTAACAAA 

>G1492 Amino Acid Sequence (domain in AA coordinates: 34-83)

MGKSSGRNGNGSFNGNKFHGVRPYVRSPVPRLRWTPDLHRCFVHAVEILGGQHRATPKLV

LKMMDVKGLTISHVKSHLQMYRGGSKLTLEKPEESSSSSIRRRQDSEEDYYLHDNLSLHT 

RM)CLLGFHSFPLSSHSSFRGGGGGRTKEQQTSESGGYDDDADFLHIKKMNDTTTFLSHH 
FPKGTEEWREQEHEEEEEDLSLSLSLNHHHWRSNGSSVVSETSEAAVSTCSAPFVSKDCF

GSSKIDLNLSISLLGS* 

>G1531 (1..666) 

ATGTGTGAGTCAAGCAACAAAGTCAGAGTATCGCCATACCCGCTTCGGTCTTCGAGGACC 
GACAAACACAAGGCGTCAGAGTCGCCTATTGAGACAGGTTGGGAGGATGTGCGTGGATGT 
CATCCTTACATGTGCGATACGAGTGTTCGTCACTCCAATTGTTTCAAGCAGTTCCGCAGA 
AAAACCATAAAAAAGCGCCTATACCCCAAGACCTTACATTGTCCTCTCTGTAGAGGTGAA 
GTATCCGAGACGACAAAGGTGACGAGCACTGCAAGAAGATTTATGAATGCTAAACCGAGG 
TCTTGCTCCGTAGAGGATTGCAAATTCTCTGGGACGTTTT^ 

AAAACTGAGCATCGCGGTATTGTGCCACCAAAGGTCGATCCACTGAGACAACAGAGATGG 

GAAATGATGGAGAGACATTCTGAATACGTTGAACTCATGACTGCAGCTGGGATTTCGCGT 

ATGGCTGAGGTGATGC^C^C^GCTTCCCC^GGATCAGAATC^TCCTC^TGTGTTTCAA 

GTGACCGTTT^ATGGAACCATATGGAATCTAATTGATCCGAGTCAGGGAAGGAATGGATTA 

GGCATCACCAACTATAGCGCAATGCAGTTTGTACCATTAAGCATAAATCACA 

CTGTGA 

>G1531 Amino Acid Sequence (domain in AA coordinates: 41-77) 
MCESSNKVRVSPYPLRSSRTDKHKASESPIETGWEDW 

KTIKKRLYPKTLHCPLCRGEVSETTKVTSTARRFMNAKPRSCSVEDCKFSGTFS 
KTEHRGrVPPKVDPLRQQRWEMMERHSEYVEL^ 
VTVNGTIWNLIDPSQGRNGLGITNYSAMQFVPLSINHSRTL* 
>G1540 (122..997)

atctctttactaccagcaagttgttttcttgctaacttcaaacttctctttctcttgttc 
ctctctaagtcttgatcttatttaccgttaactttgtgaacaaaagtcgaatcaaacaca 
catggagccgccacagcatcagcatcatcatcatcaagccgaccaagaaagcggcaacaa 
caacaacaagtccggctctggtggttacacgtgtcgccagaccagcacgaggtggacacc 
gacgacggagcaaatcaaaatcctcaaagaactttactacaacaatgcaatccggtcacc 
aacagccgatcagatccagaagatcactgcaaggctgagacagttcggaaagattgaggg 
caagaacgtcttttactggttccagaaccataaggctcgtgagcgtcagaagaagagatt 
caacggaacaaacatgaccacaccatcttcatcacccaactcggttatgatggcggctaa 
cgatcattatcatcctctacttcaccatcatcacggtgttcccatgcagagacctgctaa 
ttccgtcaacgttaaacttaaccaagaccatcatctctatcatcataacaagccatatcc 
cagcttcaataacgggaatttaaatcatgcaagctcaggtactgaatgtggtgttgttaa 
tgcttctaatggctacatgagtagccatgtctatggatctatggaacaagactgttctat 
gaattacaacaacgtaggtggaggatgggcaaacatggatcatcattactcatctgcacc 
ttacaacttcttcgatagagcaaagcctctgtttggtctagaaggtcatcaagacgaaga 
agaatgtggtggcgatgcttatctggaacatcgacgtacgcttcctctcttccctatgca 
cggtgaagatcacatcaacggtggtagtggtgccatctggaagtatggccaatcggaagt 
tcgcccttgcgcttctcttgagctacgtctgaactagctcttacgccggtgtcgctcggg 
attaaagctctttcctctctctctctctttcgtactcgtatgttcacaactatgcttcgc 
tagtgattaatgatgcagttgttatattagtagttaactagttatctctcgttatgtgta 
atttgtaattactagctaagtatcgtctaggtttaattgtaattgacaaccgtttatctc 
tatgatgaataagttaaatttatatat 

>G1540 Amino Acid Sequence (domain in AA coordinates: 35-98) 
MEPPQHQHHHHQADQESGNNNNKSGSGGYTCRQTSTRWTPTTEQIKILKELYYNNAIRSP 
TADQIQKITARLRQFGKIEGKNVFYWFQNHKARERQKKRFNGTNMTTPSSSPNSVMMAAN
DHYHPLLHHHHGVPMQRPANSVNVKLNQDHHLYHHNKPYPSFNNGNLNHASSGTECGVVN

ASNGYMSSHVYGSMEQDCSMNYNNVGGGWANMDHHYSSAPYNFFDRAKPLFGLEGHQDEE
ECGGDAYLEHRRTLPLFPMHGEDHINGGSGAIWKYGQSEVRPCASLELRLN* 
>G1544 (1..2178) 

ATGTCTCAGTCAAACATGGTACCAGTGGCTAACAACGGAGACAACAACAACGACAACGAA 

AACAACAACAACAACAACAACAATGGTGGAACTGACAACACTAATGCTGGAAATGATTCT 

GGAGATCAAGATTTCGACAGTGGGAATACCTCAAGTGGCAATCATGGAGAAGGGTTGGGA 

AACAATCAAGCTCCTCGTCATAAGAAGAAAAAATACAATCGTCACACCCAACTTCAGATT 

TCGGAGATGGAAGCTTTCTTCAGAGAGTGTCCTCACCCAGATGACAAACAAAGGTACGAC 

CTTAGCGCTCAATTGGGATTGGACCCTGTTCAGATCAAATTCTGGTTCC^GAACAAA 

ACTCAAAACAAGAATCAACAAGAACGCTTTGAGAACTCAGAACTTCGGAATCTGAACAAC 

CACCTTAGGTCTGAAAATCAGCGGTTACGAGAAGCTATTCATCAAGCCTTATGCCCTAAG 
TGTGGAGGCCAAACTGCAATTGGCGAAATGACCTTCGAAGAGCACCATCTTCGCATCCTC
AACGCTTOTTTGACTGAAGAGATCAAGCAACnTTCCGTGACAGCGGAAAAGATATCAAGG 
CTTACGGGGATACCAGTAAGGAGCCATCCCCGTGTGTCTCCTCCTAATCCTCCTCCAAAT 
TTCGAGTTCGGGATGGGATCTAAGGGAAATGTCGGAAACCACTCGAGGGAAACCACTGGA 
CCTGCAGATGCTAATACCAAGCCGATCATCATGGAGTTGGCATTTGGAGCCATGGAGGAG 
CTCTTGGTGATGGCTCAAGTGGCTGAACCACTGTGGATGGGAGGATTTAATGGCACTAGC
TTAGCTTTGAACTTGGATGAATACGAAAAGACGTTTCGCACGGGTCTCGGTCCTAGACTT 
GGCGGGTTTCGAACCGAGGCATCCAGGGAAACTGCACTCGTGGCAATGTGTCCTACTGGC 
ATTGTTGAAATGCTCATGCAAGAGAATCTGTGGTCAACAATGTTTGCCGGAATTGTTGGT 
AGAGCCAGGACTCATGAACAGATAATGGCTGATGCTGCTGGAAACTTCAATGGAAATCTC 
CAAATAATGAGTGCTGAGTACCAAGTGCTTTCCCCGCTAGTCACAACCCGCGAAAGCTAC 
TTCGTCCGCTACTGTAAGCAACAAGGAGAGGGTTTGTGGGCGGTGGTCGATATTTCCATC 
GACCATCTCCTCCCAAACATCAACCTAAAATGTCGCCGCCGACCCTCTGGATGTCTGATT 
CAAGAAATGCATAGTGGTTACTCCAAGGTTACATGGGTGGAACATGTGGAAGTAGATGAT 
GCAGGAAGTTACAGCATCTTTGAGAAATTAATCTGTACTGGTCAAGCTTTTGCTGCTAAC 
CGCTGGGTTGGTACATTGGTACGCCAGTGTGAGCGGATATCTAGCATCTTGTCGACAGAT 
TTTCAATCTGTCGATTCCGGTGATCACATAACGCTAACTAACCATGGAAAGATGAGCATG 
CTGAAGATAGCTGAGCGGATTGCGAGAACCTTCTTTGCTGGAATGACCAATGCGACGGGG 
TCTACAATATTTTCTGGTGTTGAAGGAGAAGATATCAGAGTGATGACAATGAAGAGCGTG 
AATGATCCAGGAAAGCCTCCCGGTGTC^TTATTO^ 

GCTCCTCCTAACACTGTOTTTGACTTCCTCAGAGAGGCTACTCACCGACACAATTGGGAT 
GTTCTCTGCAACGGAGAGATGATGCACAAGATAGCAGAGATTACGAATGGGATAGACAAA 
AGGAACTGTGCAAGTTTACTCCGGCATGGACACACTAGCAAGAGCAAGATGATGATAGTT 
CAAGAGACH^CTACTGACCC^t^GCTTCATTTGTGCTTTATGCGCCTGTTGATATGACA 
TCAATGGATATTACTCTCCATGGAGGTGGTGATCCTGACTTTGTGGTGATCCTGCCTTCT 
GGTTTTGCTATTTTTCCAGATGGTACGGGTAAGCCTGGAGGAAAAGAAGGAGGATCACTT 
TTGACCATTTCCTTCCAAATGCTGGTTGAGTCAGGTCCTGAGGCTAGGCTGAGTGTTAGC 
TCTGTTGCAACTACTGAGAATCTGATTCGTACAACCGTGCGGAGGATCAAAGATTTGTTT 
CCTTGTCAGACTGCTTGA 

>G1544 Amino Acid Sequence (domain in AA coordinates: 64-124) 
MSQSNMVPVANNGDJ^NITONENNNNNNNNGG 

NNQAPRHKKKKYNRHTQLQISEMEAFFRECPHPDDKQRYDLSAQLGLDPVQIKFWFQl^ 
TQNKNQQERFENSELRNLNNHLRSENQRLREAIHQALCPKCGGQTAIGEMTFEEHHLRIL
NARLTEEIKQLSVTAEKISRLTGIPVRSHPRVSPPNPPPNFEFGMGSKGNVGNHSRETTG
PADANTKPIIMELAFGAMEELLVMAQVAEPL^ 

GGFRTEASRETALVAMCPTGIVEMLMQENLWSTMFAGIVGRARTHEQIMADAAGNFNGNL 
QIMSAEYQVLSPLVTTRESYFVRYCKQQGEGLWAVVDISIDHLLPNINLKCRRRPSGCLI 
QEMHSGYSKVTWVEHVEVDDAGSYSIFEKLICTGQAFAANRWVGTLVRQCERISSILSTD
FQSVDSGDHITLTNHGKMSMLKIAERIARTFFAGMTNATGSTIFSGVEGEDIRVMTMKSV
NDPGKPPGVIICAATSFWLPAPPNTVFDFLREATHRHNWDVLCNGEMMHKIAEITNGIDK
RNCASLLRHGHTSKSKMMIVQETSTDPTASFVLYAPVDMTSMDITLHGGGDPDFVVILPS
GFAIFPDGTGKPGGKEGGSLLTISFQMLVESGPEARLSVSSVATTENLIRTTVRRIKDLF 
PCQTA* 

>G156 (39..755)

AGGAAGAGGGAGCCACTCATAAGAGGAAGAAGAGAGAGATGGGTAGAGGGAAGATAGAGA 
TAAAGAAGATAGAGAATCAGACGGCGAGGCAAGTGACCTTCTCCAAGAGAAGAACTGGTC 
TTATAAAGAAGACTCGTGAGCTCTCTATTCTCTGTGACGCTCACATCGGTCTCATCGTCT 
TCTCAGCCACCGGAAAGCTTTCCGAGTTCTGCTCCGAACAGAACAGGATGCCTCAACTCA 
TTGACCGATACTTGCATACCAACGGATTGCGACTTCCTGATCATCATGACGACCAGGAGC 
AATTGCACCATGAGATGGAACTACTAAGAAGAGAGACATGTAACCTTGAGCTTCGTCTGC 
GTCCATTCCATGGACATGACTTAGCCTCCATTCCTCCTAATGAGCTTGACGGACTCGAGA 
GACAGCTAGAACATTCTGTCCTCAAAGTCCGTGAGCGTAAGAGGAGGATGCTAGAAGAAG 
ATAACAACAACATGTACCGTTGGCTTCATGAGCATCGTGCAGCGATGGAGTTTCAACAAG 
CTGGGATAGATACCAAACCAGGGGAGTATCAACAGTTTATAGAGCAGCTTCAGTGCTATA
AACCAGGGGAGTATCAGCAGTTTCTAGAGCAGCAGCAACAACAACCAAACAGCGTTCTTC 
AGCTTGCTACACTTCCTTCTGAGATTGATCCTACTTACAATCTCCAGCTTGCTCAGCCTA 
ATCTTCAAAACGATCCAACGGCCCAGAATGATTAATACAATTCTCAATAGATATCTACTC 
TTTCTTTATGGAGACAGATTCATGAACTTTTATTACCT
TCTTTTGTGTGGCTATGGAAACCTTGTTTAA^ 

TAATTAATCATCATTATTACATANWAAANAANNAAAAAAAAAAAAAA 

>G156 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRGKIEIKKIENQTARQVTFSKRRTGLIKKTRELSILCDAHIGLIVFSATGKLSEFCSE 

QNRMPQLIDRYLHTNGLRLPDHHDDQEQLHHEMELLRRETCNLELRLRPFHGHDLASIPP

NELDGLERQLEHSVLKVRERKRRMLEEDNNNMYRWLHEHRAAMEFQQAGIDTKPGEYQQF

IEQLQCYKPGEYQQFLEQQQQQPNSVLQLATLPSEIDPTYNLQLAQPNLQNDPTAQND*
>G1584 (160..1281)

ATTCAC^TTTTTATTTATCTTTCCATTTAGCCATTCTGT^ 

TTTTTGACACATCACATCATCATCACATCATCATTCAACATCAATCATCA 

ACACATACATCTGTGTTCTGCGGATCGAGTTAATTAGTTATGGCTTCTTCGAATAGACAC 

TGGCCAAGCATGTTCAAGTCCAAACCTCATCCCCATC^TGGCAA(^TGA(^TC^ACTCT 

CCTCTCTTGCCTTCTGCTTCTCACCGATCTTCTCCTTTCTCTTCAGGATGTGAAGTGGAG 

AGGAGTCCAGAGCCAAAACCAAGATGGAATCCAAAGCCAGAGCAGATTCGGATACTTGAA 

GCAATCTTTAACTCCGGGATGGTGAATCCTCCAAGAGAGGAGATCAGGCTTCAAGAATAC 

GGCCAAGTCGGTGATGCTAACGTCTTCTACTGGTTCCAAAACCGTAAGTCCCGTAGTAAA 

CACAAACTCCGCCTCCTCC^CAACCACTCC^^ 

CCGCAGCCGCAACCTTCGGCTTCCTCTTCCTCTTCCTCCTCCTCTTCCTCCTCCAAATCC 

ACCAAACCCCGAAAAAGCAAGAACAAGAAGAACACTAATCTCTCTT^ 

ATGATGGGGATGTTTCCACCGGAACCGGCGTTTCTCTTCCCGGTCTCCACTGTCGGAGGG 

TTTGAAGGTATCACCGTCTCATCCCAATTAGGGTTTCTCTCCGGTGATATGATTGAGCAA 

CAAAAACCGGCTCCAACGTGTACCGGACTCCTGCTGAGTGAGATCATGAACGGTAGTGTG 

AGTTATGGAACTC^TCATCAACAACACTTGAGTGAGAAAGAAGTTGAAGAAATGAGGATG 

AAGATGTTGCAACAGCCACAGACTCAGATTTGTTACGCTACCACTT^ATCATCAAATAGCT 

TCTTACAACAACAACAACAACAACAATAACATCATGCTTCATATTCCTCCCACTACTTCT

ACTGCCACCACTATTACTACTTCGCATTCTCTCGCTACTGTCCCATCAACTTCGGACCAG 

CTTCAAGTTCAAGCGGACGCACGAATAAGAGTTTTCAT 

AGCTCAGGACCGTTCAATGTGAGGGATGCATTTGGGGAAGAGGTTGTTCTGATTAATTCC 
GCGGGTCAGCCCATTGTCACCGATGAATATGGCGTCGCTCTTCACCCTCTTCAACACGGA 
GCCTCX3TACTATCTGATCTAGTCGTGTGGGAGATTTGAGTTTGAAGAAGAAATTAAGACC 
TGTCTCTTTCTTTCACCATCTCTCGTACGTAGGCTTAAATGTTAAGATTTTATAAAGTAT 
TGGTTTCAGTTACCTGTTGTGACGGTGTTTATGTATC 

CTCTCTCGTTAAATTGTTGACCAATAATATATGATGTGTGTTTCATTATTATCTAAAAAA 
AA 

>G1584 Amino Acid Sequence (domain in AA coordinates: TBD) 
MASSNRHWPSMFKSKPHPHQWQHDINSPLLPSASHRSSPPSSGCEVERSPEPKPRWNPKP 
EQIRILEAIFNSGMVNPPREEIRLQEYGQVGDANVFYWFQNRKSRSKHKLRLLHNHSKHS 
LPQTQPQPQPQPSASSSSSSSSSSSKSTKPRKSKNKNNTNLSL^ 

PVSTVGGFEGITVSSQLGFLSGDMIEQQKPAPTCTGLLLSEIMNGSVSYGTHHQQHLSEK 
EVEEMRMKMLQQPQTQICYATTNHQIASYNNNNNNNNIMLHIPPTTSTATTITTSHSLAT

VPSTSDQLQVQADARIRVFINEMELEVSSGPFNVRDAFGEEVVLINSAGQPIVTDEYGVA

LHPLQHGASYYLI* 

>G1587 (1..816) 

ATGGGCTACATCTCCAACAACAACCTCATCAACTATTTGCCCCTCTCTACTACTCAACCT 
CCTCTTCTTCTCACCCACTGTGATATTAACGGCAATGATCACCATCAGCTCATAACCGCA 
TC^TCAGGAGAACACGATATTGATGAACGGAAAAACAACATTCCTGCGGCGGCGACTTTG 
AGATGGAATCCGACGCCAGAGCAGATCACGACGCTAGAAGAGCTTTACAGAAGCGGAACA 
CGGACGCCGACGACGGAACAGATCCAACAGATAGCATCTAAGCTTCGTAAATATGGGAGA 
ATCGAAGGGAAGAACGTTTTCTATTGGTTTCAGAATCATAAGGCTAGAGAGAGACTAAAA 
CGCCGCCGTCGTGAAGGTGGTGCTATTATCAAACCACATAAAGACGTCAAGGATTCATCA 
TCAGGTGGTCATCGAGTTGATCAGACAAAGCTCTGCCCATCTTTTCCACACACAAACCGA 
C(^CAGCCACAGCATGAATTAGATCCTGCGAGTTA 

GAAGATCATGGGACGACTGAAGAATCTGATCAGAGGGCATCAGAGGTTGGTAAATACGCC 
ACATGGAGAAATCTTGTTACTTGGTCGATAACTCAACAACCGGAAGAGATTAATATCGAC 
GAAAATGTCAACGGAGAAGAAGAAGAAACGAGGGACAACCGGACTTTAAATCTCTTTCCG 
GTTAGGGAGTACCAAGAGAAAACAGGCCGGTTGATAGAGAAGACGAAAGCATGCAACTAC 
TGTTACTACTACGAGTTCATGCCTCTGAAGAACTGA

>G1587 Amino Acid Sequence (conserved domain in AA coordinates: 61-121) 

MGYISNNNLINYLPLSTTQPPLLLTHCDINGNDHHQLITASSGEHDIDERKNNIPAAATL 

RWNPTPEQITTLEELYRSGTRTPTTEQIQQIASKLRKYGRIEGKNVFYWFQNHKARERLK 

RRRREGGAIIKPHKDVKDSSSGGHRVDQTKLCPSFPHTNRPQPQHELDPASYNKDNWANN

EDHGTTEESDQRASEVGKYATWRNLVTWSITQQPEEINIDENVNGEEEETRDNRTLNLFP

VREYQEKTGRLIEKTKACNYCYYYEFMPLKN*

>G1588 (1..2232) 

ATGTACCATCCAAACATGTTTGAGAGCCATCATATGTTCGATATGACCCCAAAGAGTACC 

TCTGATAACGACTTGGGAATCACCGGTAGCCGAGAAGATGACTTTGAGACCAAGTCAGGT

ACCGAAGTCACTACTGAGAATCCTTCTGGTGAAGAGCTTCAAGATCCTAGCCAACGTCCC 

AACAAAAAGAAGCGTTACCATCGCCACACGCAACGCCAAATTCAAGAGCTCGAATCATTC 

TTTAAGGAATGTCCTCATCCAGATGATAAGCAACGAAAAGAGTTGAGCCGTGATCTCAAT 

TTAGAGCCTCTTCAAGTTAAGTTTTGGTTCCAAAACAAACGCACACAGATGAAGGCACAA 

AGTGAGAGGCATGAGAACCAGATTCTAAAGTCAGACAATGACAAGCTCAGAGCAGAGAAC 

AATAGATACAAAGAAGCTCTAAGCAATGCTACATGCCCTAACTGTGGCGGTCCAGCTGCT 

ATTGGAGAAATGTCTTTTGACGAACAACATCTCAGGATCGAAAATGCTCGGCTCCGCGAA 

GAGATTGATAGGATCTCTGCTATTGCTGCGAAATACGTTGGGAAGCCGTTAGGATCGTCT 

TTCGCTCCACTAGCGATCCACGCGCCTTCTCGTTCGCTTGATCTTGAAGTTGGAAACTTT 

GGGAACCAGACAGGCTTTGTAGGAGAAATGTATGGAACAGGGGACATTTTGAGGTCAGTT 

TCX3ATTCCTTCTGAGACTGATAAGCCTATAATCGTGGAGCTAGCGGTTGCAGCTATGGAG 

GAACTCGTGAGAATGGCTCAAACTGGAGATCCTTTATGGCTTTCAACCGATAATTCAGTC 

GAGATTCTCAACGAAGAAGAGTATTTCAGAACGTTTCCGAGAGGAATTGGACCAAAGCCA 

TTAGGATTAAGATCAGAGGCGTCAAGACAATCTGCAGTTGTTATAATGAATCACATCAAT 

CTCGTTGAGATTCTCATGGATGTGAATCAATGGTCTTGTGTTTTCTCTGGGATTGTGTCA 

AGAGCCTTGACACTTGAAGTTCTTTCAACTGGAGTTGCTGGGAACTACAACGGTGCTTTA 

CAAGTGATGACAGCTGAGTTTCAAGTTCCATCACCCCTAGTCCCAACGCGTGAGAACTAC 

TTTGTGAGATACTGCAAACAACACAGTGACGGCTCTTGGGCTGTGGTTGATGTCTCTTTG 

GACAGCCTTAGACCAAGTACTCCAATCTTAAGAACTAGAAGAAGGCCTTCAGGTTGTCTG 

ATTCAAGAATTGCCTAATGGTTATTCTAAGGTTACATGGATAGAGCATATGGAGGTAGAT 

GATAGATCAGTTCACAACATGTATAAACCGTTGGTTCAGTCCGGTTTAGCTTTCGGTGCG 

AAACGTTGGGTGGCTACACTCGAACGACAATGCGAGCGGCTTGCTAGCTCCATGGCCAGC 

AACATTCCTGGTGATCTTTCCGTGATAACGAGTCCTGAAGGAAGGAAGAGTATGTTGAAG 

CTAGCTGAGAGAATGGTTATGAGTTTCTGCAGTGGTGTTGGCGCGTCGACTGCACACGCT 

TGGACAACAATGTCGACAACAGGATCCGATGATGTTCGGGTCATGACCCGCAAGAGTATG 

GATGATCCAGGAAGACCTCCGGGTATTGTTCTTAGTGCAGCTACTTCATTCTGGATCCCA 

GTTGCTCCCAAACGTGTTTTTGATTTCCTCCGTGACGAAAATTCAAGAAAAGAGTGGGAT 

ATTCTGTCAAATGGAGGTATGGTTCAGGAAATGGCTCATATAGCCAATGGTCATGAACCT 

GGAAACTGTGTCTCCTTGCTCCGAGTCAATAGTGGAAACTCGAGCCAGAGCAACATGTTG 

ATTCTACAAGAGAGCTGTACAGATGCATCAGGATCGTATGTGATTTACGCGCCAGTGGAT 

ATAGTGGCGATGAATGTGGTTCTAAGCGGTGGAGATCCTGATTACGTGGCGTTGTTGCCG 

TCTGGTTTTGCTATTTTACCGGATGGTTCGGTTGGAGGAGGAGATGGGAATCAGCATCAG 

GAAATGGTTTCTACTACTTCTTCTGGGAGTTGTGGTGGTTCGCTTTTAACCGTTGCGTTT 

CAGATTCTTGTTGACTCTGTTCCTACAGCTAAACTCTCACTTGGCTCGGTGGCTACGGTT 

AATAGTCTGATCAAATGTACGGTGGAGAGGATTAAAGCTGCTGTTTCTTGTGATGTTGGA 

GGAGGAGCGTAG 

>G1588 Amino Acid Sequence (domain in AA coordinates: 66-124; 

MYHPNMFESHHMFDMTPKSTSDNDLGITGSREDDFETKSGTEVTTENPSGEELQDPSQRP

NKKKRYHRHTQRQIQELESFFKECPHPDDKQRKELSRDLNLEPLQVKFWFQNKRTQMKAQ

SERHENQILKSDNDKLRAENNRYKEALSNATCPNCGGPAAIGEMSFDEQHLRIENARLRE 

EIDRISAIAAKYVGKPLGSSFAPLAIHAPSRSLDLEVGNFGNQTGFVGEMYGTGDILRSV

SIPSETDKPIIVELAVAAMEELVRMAQTGDPLWLSTDNSVEILNEEEYFRTFPRGIGPKP

LGLRSEASRQSAVVIMNHINLVEILMDVNQWSCVFSGIVSRALTLEVLSTGVAGNYNGAL

QVMTAEFQVPSPLVPTRENYFVRYCKQHSDGSWAVVDVSLDSLRPSTPILRTRRRPSGCL

IQELPNGYSKVTWIEHMEVDDRSVHNMYKPLVQSGLAFGAKRWVATLERQCERLASSMAS

NIPGDLSVITSPEGRKSMLKLAERMVMSFCSGVGASTAHAWTTMSTTGSDDVRVMTRKSM 

DDPGRPPGIVLSAATSFWIPVAPKRVFDFLRDENSRKEWDILSNGGMVQEMAHIANGHEP 
GNCVSLLRVNSGNSSQSNMLILQESCTDASGSYVIYAPVDIVAMNVVLSGGDPDYVALLP
SGFAILPDGSVGGGDGNQHQEMVSTTSSGSCGGSLLTVAFQILVDSVPTAKLSLGSVATV 

NSLIKCTVERIKAAVSCDVGGGA* 
>G1589 (179..2221)

ACCAAACTCACATAGCAATCACACACATCTCCACAAACACAGCTTGAGATGATCATGAAA 
CACGTGCATCCTCAGATCTCTATCAATCCAGCTTGGTGAAAGAAGGTCAAGAATTGAAAG 
AGAATCAAAGAAAACGACGTCGTTTCATTCGTGTGTAACAACTACTT^ATTATACATAGAT 
GGCTGCTTACTTTCACGGAAACCCACCGGAGATCTCTGCCGGATCCGACGGTGGTCTTCA 
AACGTTGATCCTCATGAATCCAACTACTTACGTTCAGTACACCCAACAAGACAACGACTC 
GAACAACAACAACAACAGCAACAATAGCAACAACAACA 

C^CAACAGTAGTTTCGTTTTCCTCGATTCCCACGCGCCGCAGCCAAACGCGAGCCAGCA 
GTTCGTCGGAATACCACTCTCAGGTCACGAAGCTGCTTCCATTACAGCCGCCGACAACAT 
CTCCGTACTTCACGGTTATCCTCCGCGCGTGCAGTACAGTCTCTACGGTAGCCACCAAGT 
GGATCCCACTCACCAGCAAGCCGCGTGTGAGACGCC^CGCGCGCAGCAAGGCCTCTCTTT 
AACCCTCTCGTCTCAACAGCAGCAGCAACAGCAACATCATC^ 

CGTCGGATTCGGGTCCGGACATGGAGAAGATATCCGGGTCGGGTCTGGCTCTACAGGATC 

GGGGGTAACAAACGGTATAGCTAATCTTGTTAGCTCCAAGTACTTGAAGGCAGCACAAGA 

GCTTCTTGACGAAGTAGTCAACGCTGATTCCGATGACATGAACGCTAAATCCCAACTATT

CTCATCGAAAAAGGGTAGTTGCGGAAATGATAAACCTGTCGGAGAATCATCGGCCGGCGC 

TGGAGGAGAAGGTTCCGGTGGCGGAGCAGAAGCAGCCGGGAAACGTCCGGTGGAGCTAGG 

CACGGCAGAGAGACAAGAAATACAGATGAAGAAAGCAAAACTTAGTAACATGCTTCATGA

GGTGGAGCAGAGATATAGACAGTACCACCAGCAGATGCAGATGGTGATCTCTTCGTTCGA 

GCAAGCGGCAGGGATAGGATCAGCGAAGTCATACACGTCGCTAGCATTGAAAACCATATC 

AAGACAGTTCCGTTGCTTGAAAGAGGCGATCGCTGGTCAGATAAAAGCGGCCAACAAGAG 

TCTTGGGGAGGAAGATTCAGTGTCTGGTGTTGGGAGGTTTGAGGGGTCGAGGCTCAAGTT 

CGTGGACCACCACTTGAGAC^GCAAAGAGCTCTTCAACAACTGGGAATGATTCAACATCC 

TTCC^TAATGCnTGGAGACCTCAACGTGGTCTCCCAGAACGAGCCGTCTCAGTTCTCCG 

TGCTTGGCTCTTCGAACACTTTCTTCATCCATACCCTAAGGATTCGGACAAGCACATGCT 

AGCTAAGCAAACAGGACTCACTCGTAGGCAGGTGTCGAACTGGTTTATAAACGCGAGAGT 

TCGGTTATGGAAACCAATGGTGGAGGAGATGTACATGGAGGAAATGAAGGAGCAGGCAAA 

GAACATGGGATCCATGGAAAAGACTCCTTTGGATCAAAGCAACGAAGATTCTGCTTCAAA 

GTCAACAAGTAACCAAGAAAAGAGCCCAATGGCGGACACTAATTACCATATGAATCCCAA

TCACAACGGTGACCTAGAAGGCGTCACTGGAATGCAAGGATGCCCCAAGAGACTAAGAAC 

CAGCGACGAGACAATGATGCAGCCAATAAATGCGGATTTCAGCTCCAACGAGAAGCTCAC 

GATGAAGATTCTAGAAGAACGGCAAGGGATAAGATCAGATGGTGGCTACCCTTTCATGGG 

TAATTTCGGGCAATACCAAATGGATGAGATGTCAAGATTTGATGTAGTCTCAGACCAGGA 

GCTCATGGCGCAAAGGTACTCAGGAAACAACAATGGCGTGTCCCTCACGTTAGGTTTACC 

TCATTGTGATAGCTTGTCGTCCACGGACCATCAGGGTTTCATGCAGACCCACCATGGGAT 

TCCTATAGGGAGAAGAGTGAAAATAGGAGAAACAGAGGAATATGGACCCGCCACCATCAA 

TGGTGGTAGCTCGACCACAACCGCACATTCATCAGCGGCAGCTGCCGCGGCTTACAATGG 

GATGAACATACAGAACCAGAAGAGATATGTGGCTCAGTTATTGCCCGACTTCGTTGCATA 

AACCCATCTCTCTAGAAGGAGAAACCGAAACAGGTTATTATATACGTTTCTAGTTTTTAA 

ITAGTATATAGTTTCTCATACCATTGAACCAAAACAAAGAACAAAATTTAATO 

TTGGTTATATATGGCCGACGGGCTACGTCAGGGCCCTGACGTAGC 

>G1589 Amino Acid Sequence (conserved domain in AA coordinates: 384-448)
MAAYFHGNPPEISAGSDGGLQTLILMNPTTYVQYTQQD^

NNNSSFVFLDSHAPQPNASQQFVGIPLSGHEAASITAADNISVLHGYPPRVQYSLYGSHQ 
VDPTHQQAACETPRAQQGLSLTLSSQQQQQQQHHQQHQPIHVGFGSGHGEDIRVGSGSTG 
SGVTNGIANLVSSKYLKAAQELLDEVVNADSDDMNAKSQLFSSKKGSCGNDKPVGESSAG
AGGEGSGGGAEAAGKRPVELGTAERQEIQMKKAKLSNMLHEVEQRYRQYHQQMQMVISSF
EQAAGIGSAKSYTSLALKTISRQFRCLKEAIAGQIKAANKSLGEEDSVSGVGRFEGSRLK 
FVDHHLRQQRALQQLGMIQHPSNNAWRPQRGLPERAVSVLRAWLFEHFLHPYPKDSDKHM
LAKQTGLTRSQVSNWFINARVRLWKPMVEEMYMEEMKEQAKNMGSMEKTPLDQSNEDSAS

KSTSNQEKSPMADTNYHMNPNHNGDLEGVTGMQGCPKRLRTSDETMMQPINADFSSNEKL 
TMKILEERQGIRSDGGYPFMGNFGQYQMDEMSRFDVVSDQELMAQRYSGNNNGVSLTLGL
PHCDSLSSTDHQGFMQTHHGIPIGRRVKIGETEEYGPATINGGSSTTTAHSSAAAAAAYN 
GMNIQNQKRYVAQLLPDFVA*
>G160 (38..784)

TCAAATTTGTCATTTGTTTATTCAAATTT^ 

TCAGAAAATAGAGATGAAAAAAATGGAAAACGAAAGCAACCTTCAGGTTACTTTCTCAAA 
AAGAAGATTCGGTCTTTTCAAAAAAGCTAGTGAACTTTGCACATTAAGTGGTGCAGAGAT 
TCTGTTGATTGTGTTCTCTCCTGGTGGGAAAGTGTTCT 

AGAACTC^TTCATCGCTTTTCGAATCCTAACCATAATTCTGCCATTGTCCATCATCAG^ 

CAAC^TCTCC/^CTTGTTGAAACCCGTCCGGATAGAAATATCCAATATCTCAACAATA 

ACTCACTGAGGTGCTGGCAAACCAGGAAAAGGAGAAACAGAAGAGAATGGTTTTGGACCT 

ATTGAAAGAATCCAGAGAACAAGTAGGAAACTGGTATGAAAAAGATGTGAAAGATCTCGA

CATGAATGAAACCAACCAGCTGATATCTGCTCTTCAAGATGTGAAAAAGAAACTGGTAAG 

AGAAATGTCTCAATATTCTCAAGTAAATGTTTC 

CGTGATTGGTGGTGGTAATGTTGGCATTGATCTTTTTGATCAAAGAAGAAATGCATTCAA 
CTATAATCCAAACATGGTGTTTCCCAATCATACACCACC^ 

TGGAGTTCTCGTTCCGATATCCAACATGAACTACATGTCAAGTTACAACTTCAACCAGAG 
OTAGAGTCTGAAGCTAGAAGAACATCCTAATCAATATTTGCGTTATTTTGGCTATGGTTA 
CTGTTAGGATTGTTCTTGTATTGTGAGACTTAAGTTTGTTTTTTCTTTTAATTTGTTTCA 
GTTGGTTGGTTTTTCATTTTATTCGTCGTTTGTTTTCCTTTGTTTTTGGATATTTTTGTA 
TCCCAGAATAAATTTATTTATCCTTTAAAAA 

>G160 Amino Acid Sequence (domain in AA coordinates: 7-62) 

MWSTKGRQKIEMKKMENESNLQVTFSKRRFGLFKKASELCTLSGAEILLIVFSPGGKVF 

SFGHPSVQELIHRFSNPNHNSAIVHHQNl^QLVETRPDR^ 

QKRMVLDLLKESREQVGNWYEKDVKD^ 

NYFGQSSGVIGGGWGIDLFDQRRNAFNYNPNMW 

SSYNFNQS* 

>G1636 (19..666)

GAGTAATCATCAACGATTATGGCGTCAAGTCAGTGGACGAGGTCGGAGGATAAGATGTTT 
GAGCAAGCTTTGGTTCTTTTTCCTGAAGGATCTCCTAATCGGTGGGAGAGAATCGCTGAT 
CAGCTTCATAAATCTGCTGGTGAAGTTAGGGAGCATTACGAGGTCTTGGTTCATGATGTT 
TTCGAGATTGATTCTGGTCGAGTTGATGTCCCTGATTACATGGATGACTCGGCGGCTGCG 
GCGGCGGGTTGGGATTCCGCTGGTCAGATCTCTTTTGGGTCTAAACATGGCGAGAGTGAA 
CGCAAAAGAGGAACTCCTTGGACAGAGAACGAACACAAATTGTTTCTGATCGGATTAAAG 
AGATATGGTAAGGGAGATTGGAGGAGTATCTCGAGAAACGTTGTGGTGACGAGGACACCG 
ACGCAAGTCGCGAGTCACGCTCAGAAGTATTTTCTGAGACAGAACTCGGTGAAGAAGGAG
AGGAAAAGGTCGAGCATCCATGATATAACTACGGTTGATGCTACTTTGGCTATGCCTGGG 
TCTAACATGGACTGGACTGGCCAACACGGGAGTCCTGTTCAGGCGCCGCAGCAGCAACAG 
ATTATGTCTGAGTTCGGTCAGCAATTGAATCCTGGTCATTTCGAGGATTTTGGGTTTCGG 
ATGTGATG 

>G1636 Amino Acid Sequence (domain in AA coordinates: 100-165) 
MASSQWTRSEDKMFEQALVLFPEGSPNRWERIADQLHKSAGEVREHYEVLVHDVFEIDSG

RVDVPDYMDDSAAAAAGWDSAGQISFGSKHGESERKRGTPWTENEHKLFLIGLKRYGKGD 
WRSISRNVVVTRTPTQVASHAQKYFLRQNSVKKERKRSSIHDITTVDATLAMPGSNMDWT
GQHGSPVQAPQQQQIMSEFGQQLNPGHFEDFGFRM* 
>G1642 (1..1077) 

ATGGGTCATCACTCATGCTGCAACAAGCAAAAGGTGAAGAGAGGGCTTTGGTCACCTGAA 
GAAGACGAAAAGCTCATCAACTACATCAATTC^TATGGCCATGGATGTTGGAGCTCTGTT 
CCTAAACATGCAGGTTTGCAGAGATGTGGAAAGAGTTGTAGATTAAGATGGATAAATTAT 
CTAAGACCTGATCTTAAACGTGGAAGCTTCTCTCCTCAAGAAGCTGCTCTTATCATTGAG 
CTTCACAG(^TTCTTCGTAACAGATGGGCTCAAAT^ 

GATAACGAGGTCAAGAATTTCTGGAACTCGAGCATTAAAAAGAAGCTCATGTCTCACCAT 

CATCACGGTCATCATCATCATCATCTCTCTTCCATGGCGAGTTTGCTCACAAACCTTCCT 

TATCACAATGGATTCAACCCTACTACAGTCGACGATGAAAGTTCAAGATTCATGTCCAAT 

ATCATCACAAACACTAACCCTAATTTCATCACTCCAAGCCATCTCTCTCTTCCTTCTCCT 

CATGTTATGACCCCATTGATGTTCCCAACCTCTAGAGAAGGAGATTTCAAGTTTCTAACC 

ACAAACAACCCAAACCAATCTCATCACCATGATAATAACCATTACAACAACCTCGACATT 

TTGTCACCCACACCAACTATAAACAATCATCATCAACCTTCACTTTCTTCTTGTCCT 

GATAATAATCTCCAATGGCCAGCGTTACCAGATTTCCCAGCGAGTACCATTTCTGGTTTC 

C^GAAACCCTTCAAGATTATGATGATGCTAATAAACTCAACGTGTTTGTGACACCATTC 
AACGATAATGCCAAAAAGTTATTATGTGGAGAAGTTCTCGAAGGCAAAGTACTATCTTCC
TCCTCACCAATTTCACAAGATCACGGCCTTTTTCTTCC 

ACTTCTACGAGTGATCATCAACATCATCATCGAGTGGACTCATACATCAATCACATGATC 
ATACCATCATCATCCTCATCGTCGCCAATCTCTTGTGGACAGTACGTCATAACTTAA 

>G1642 Amino Acid Sequence (domain in AA coordinates: TBD) 
MGHHSCCNKQKVKRGLWSPEEDEKLINYINSYGHGCWSSVPKHAGLQRCGKSCRLRWINY 
LRPDLKRGSFSPQEAALIIELHSILGNRWAQIAKHLPGRTDNEVKNFWNSSIKKKLMSHH
HHGHHHHHLSSMASLLTNLPYHNGFNPTTVDDESSRFMSNIITNTNPNFITPSHLSLPSP

HVMTPLMFPTSREGDFKFLTTNNPNQSHH^
DNNLQWPALPDFPASTISGFQETLQDYDDANKLNVFVTPFNDNAKKLLCGEVLEGKVLSS

SSPISQDHGLFLPTTYNFQMTSTSDHQHHHRVDSYINHMIIPSSSSSSPISCGQYVIT* 
>G1747 (1..777) 

ATGAAAATGATGCAAGAGGAGGGAAACCGAAAAGGTCCATGGACAGAACAGGAAGACATA 
CTTCTGGTAAATTTTGTTCACTTATTTGGAGATCGACGATGGGATTTTATAGCAAAAGTA 
TCAGGTTTGAACAGAACAGGAAAGAGTTGCAGGCTAAGATGGGTTAATTACCTACATCCT 
GGTCTCAAACGTGGCAAGATGACGCCTCAAGAAGAGCGCCTCGTCCTTGAGCTTCACGCT 
AAGTGGGGAAACAGGTGGTCGAAAATAGCCCGAAAATTGCCGGGACGAACGGATAACGAG
ATAAAGAACTACTGGAGGACTCATATGAGGAAGAAAGCTCAAGAAAAGAAGCGTCCTGTT
TCCCCAACTTCCTCATTTTCCAACTGCAGCTCGT^TCTGTGACCACTACCACCACCAAT 
ACTCAAGATACATCGTGCCACTCGCGTAAATCTTC^ 

GGAGGTTCCCGATCCACTAGAGAGATGAATCAAGAAAACGAAGACGTGTACTCGTTGGAT 

GATATATGGAGAGAGATTGATCACTCAGGAGTAAACATAATAAAACCGGTTAAAGACATC 

TACTC^GAACAAAGCCATTGCTTAAGTTACCCAAATCTAGCTTCACCATCATGGGAAAGC 

TCATTGGATTCTATATGGAACATGGATGCAGATAAAAGTAAGATATCGTCTTACTTTGCA 

AATGATCAGTTTCCTTTCTGTTTCCAACACAGTAGATCACCATGGTCGTCAGGTTAA 

>G1747 Amino Acid Sequence (domain in AA coordinates: 11-114) 

MKMMQEEGNRKGPWTEQEDILLVNFVHLFGDRRWDFIAKVSGLNRTGKSCRLRWVNYLHP

GLKRGKMTPQEERLVLELHAKWGNRWSKIARKLPGRTDNEIKNYWRTHMRKKAQEKKRPV

SPTSSFSNCSSSSVTTTTTNTQDTSCHSRKSSGEVSFYDTGGSRSTREMNQENEDVYSLD 

DIWREIDHSGVNIIKPVKDIYSEQSHCLSYPNLASPSWESSLDSIWNMDADKSKISSYFA

NDQFPFCFQHSRSPWSSG* 

>G1749 (59..535)

CAACACTTCTCAGTGACCGTGAGCAACGAATTATTTTCAGTTCAACGACTCCGCGGAAAT 
GGAAAATTCAGAAAATGTTCCCTCTTACGATCAAAACATCAATTTCACTCCTAATTTGAC 
GAGAGATCAAGAACATGTGATCATGGTCTCTGCTTTGCAACAAGTAATATCCAACGTCGG
AGGTGACACGAACTCGAATGCATGGGAAGCTGATCTTCCACCTTTGAACGCTGGCCCTTG 
TCCTCTTTGTAGTGTCACCGGCTGCTACGGTTGCGTCTTCCCACGACACGAGGCGATAAT 
TAAGAAGGAGAAGAAGCACAAAGGAGTGAGGAAAAAACCATCAGGTAAATGGGCGGCGGA 
GATATGGGATCCGAGTTTGAAAGTAAGGAGATGGCTTGGAACGTTTCCAACAGCGGAGAT 
GGCGGCTAAGGCTTACAACGATGCGGCGGCTGAGTTTGTCGGAAGAAGATCAGCAAGACG 
TGGCACAAAGAACGGAGAGGAAGCATCTACCAAGAAGACGACTGAGAAAAATTAACGGAG 
AAGGAGCACGTATAGAAAGGCAGGAAGAGGCATCTTACTTGCTTCACAAGTAAATCAGAA 
TTTTTTTGAAAAGTAAAAACGTTATTTTGTTTGGTAATAAAATAAAGTAAAACAAAATAT
TGCTAACGCAAGACTTATCAAGTTCAGTCGTGACTGTGAGTGTGTTTTTATGTATCTTAC 
TTCATTTTTTGTCTTTCAATTGTGTGTGTGTGTGT 

>G1749 Amino Acid Sequence (conserved domain in AA coordinates: 84-155)
MENSENVPSYDQNINFTPNLTRDQEHVIMVSALQQVISNVGGDTNSNAWEADLPPLNAGP 
CPLCSVTGCYGCVFPRHEAIIKKEKKHKGVRKKPSGKWAAEIWDPSLKVRRWLGTFPTAE
MAAKAYNDAAAEFVGRRSARRGTKNGEEASTKKTTEKN* 
>G1751 (117..923)

AAACACAAACAAAACTCATATTTTCAATCTCCAGGTGCTTTACACCAACAGAGTCGCAAG 
AAAACAAAAACCAAACTCGGATTTAGTTTGACAGAAGAAGGAATCGAGAGTCGGGTATGC 
ATTATCCTAACAACAGAACCGAATTCGTCGGAGCTCCAGCCCCAACCCGGTATCAAAAGG
AGCAGTTGTC^CCGGAGCAAGAGCTTTGAGTTATTGTCTCTGCTTTGCAACACGTGATCT 
CAGGGGAAAACGAAACGGCGCCGTGTCAGGGTTTTTCCAGTGACAGCACAGTGATAAGCG 
CGGGAATGCCTCGGTTGGATTCAGACACTTGTCAAGTCTGTAGGATCGAAGGATGTCTCG 
GCTGTAACTACTTTTTCGCGCCAAATCAGAGAATTGAT^AAGAATCATCAACAAGAAGAAG 
AGATTACTAGTAGTAGTAACAGAAGAAGAGAGAGCTCTCCCGTGGCGAAGAAAGCGGAAG

GTGGCGGGAAAATCAGGAAGAGGAAGAACAAGAAGAATGGTTACAGAGGAGTTAGGCAAA 

GACCTTGGGGAAAATTTGC7^GCTGAGATC^GAGATCCTAAAAGAGC(^CACGTGTTTGGC 

TTGGTACTTTCGAAACCGCCGAAGATGCGGCTCGAGCTTATGATCGAGCCGCGATTGGAT 

TCCGTGGGCCAAGGGCTAAACTCAACTTCCCCTTTGTGGATTAGACGTCTTCAGTTTCAT

CTCCTGTTGCTGCTGATGATATAGGAGCAAAGGCAAGTGCAAGCGCCAGTGTGAGCGCCA 

CAGATTCAGTTGAAGCAGAGCAATGGT^ACGGAGGAGGAGGGGATTGCAATATGGAGGAGT 

GGATGAATATGATGATGATGATGGATTTTGGGAATGGAGATTCTTCAGA^ 

CAATTGCTGATATGTTCCAGTGATAAATGAGCTCTTTC 

AGTGCAAGAAGAGATTGACACTGTGGCTTGTTTAAAGTGAACAAGAACAAGAAAGCATGT 

AATTAGTAGTCTCATTCTTTTGTTTGTGGTCAATTCTATGTTTATCTCATATAAAATCTG 

AGTTAAACCTATCTGAGGAGAGAGTAAATAAAGAGGTTAAGAA 

>G1751 Amino Acid Sequence (domain in AA coordinates: TBD) 

MHYPNNRTEFVGAPAPTRYQKEQLSPEQELSVIVSALQHVISGENETAPCQGFSSDSTVI 

SAGMPRLDSDTCQVCRIEGCLGCNYFFAPNQRIEKNHQQEEEITSSSNRRRESSPVAKKA 

EGGGKIRKRKNKKNGYRGVRQRPWGKFAAEIRDPKRATRVWLGTFETAEDAARAYDRAAI 

GFRGPRAKLNFPFVDYTSSVSSPVAADDIGAKASASASVSATDSVEAEQWNGGGGDCNME

EWMNMMMMMDFGNGDSSDSGNTIADMFQ* 

>G1752 (25..756)

AAAAAAAAAAAAAAAAAAAAACTTATGGAATATTCCCAATCTTCCATGTATTCATCTCCA 
AGTTCITGGAGCTCATCACAAGAATCACrCTTATGGAACGAGAGCTGTTTCTTGGATC^ 
TCATCTGAACCTCAAGCCTTCTTTTGCCCTAATTATGATTACTCCGATGACTTTTTCTCA 
TTTGAGTCACCGGAGATGATGATTAAGGAAGAAATTCAAAACGGCGACGTTTCTAACTCC 
GAAGAAGAAGAAAAGGTTGGAATTGATGAAGAAAGATCATACAGAGGAGTGAGGAAAAGG 
CCGTGGGGGAAATTTGC^GCGGAGATAAGAGATTCAACGAGGAATGGAATTAGGGTTTGG 
CTCGGGACATTTGACAAAGCCGAGGAAGCCGCTCTTGCTTATGATCAAGCGGCTTTCGCC 
ACAAAAGGATCTCTTGCAACACTTAATTTCCCGGTGGAAGTGGTTAGAGAGTCGCTAAAG 
AAAATGGAGAATGTGAATCTTCATGATGGAGGATCTCCGGTTATGGCCTTGAAGAGAAAA 
CATTCTCTTCGAAACCGGCCTAGAGGGAAAAAGCGATCCTCTTCTTCTTCTTCTTCTTCT 
TCTAATTCTTCTTCTTGCTCTTCTTCTTCGTCT^ 

AAGCAGAGTGTTGTGAAGCAAGAAAGTGGTACACTTGTGGTTTTTGAAGATTTAGGTGCT 
GAGTATTTAGAACAACTTCTTATGAGCTC^VrGTT(3ATCTTGTAATTGATTTCAGCAAAAG 
CCACTATTAAACTTTAATTTTGTGATAATTAATCTTGAAATTTGTTTTGTTCATTCTGCA 
ATTTCTTTGGTTCTCTTATTTTTTGTTTGTTGTATCCAAATGAAATTATTGGAAGAGATG 
GTGATGTTAAAGTGTATATATATAAAAAAAAAA 

>G1752 Amino Acid Sequence (domain in AA coordinates: TBD) 
MEYSQSSMYSSPSSWSSSQESLLWNESCFLDQSSEPQAFFCPNYDYSDDFFSFESPEMMI 
KEEIQNGDVSNSEEEEKVGIDEERSYRGVRKRPWGKFAAEIRDSTRNGIRVWLGTFDKAE 
EAALAYDQAAFATKGSLATLNFPVEVVRESLKKMENVNLHDGGSPVMALKRKHSLRNRPR

GKKRSSSSSSSSSNSSSCSSSSSTSSTSRSSSKQSVVKQESGTLVVFEDLGAEYLEQLLM 
SSC* 

>G1763 (33..977)
CTTCGGTGGTGGCCACG
GCGGCGAGCTTATGGAAGCACTTCAACCTTTTTACAAAAGTGCTTCCACGTCTGCTTCAA 
ATCCTGCGTTTGCGTCCTCAAACGATGCGTTTGCGTCTGCCCCAAACGACCCATTTTCTT 
CTTCTTCTTACTATAATCCTCATGCATCTTTCTTCCCTTCACATTCCACAACCACTTACC 
CGGATATTTATTCTGGATCCATGACCTATCCATCTTCATTCGGGTCGGATCTTCAACAAC 
CCGAAAACTACCAATCTCAGTTCCATTACCAAAAC^

ACAACACTTGCATGCTCAACTTCATTGAGCCGAGCCAACCGGATTTTATGACCCAACCGG 
GTCCGAGTTCGGGTTCGGTTTCAAAACCGGCTAAGCTCTATAGAGGAGTGAGGCAAAGAC
ATTGGGGAAAATGGGTCGCGGAGATCCGTTTACCCAGGAACCGAACCCGACTTTGGCTCG 
GAACATTCGACACGGCTGAAGAAGCCGCGTTGGCTTATGATCGCGCCGCGTTTAAGCTTC 
GTGGTGACTCGGCTCGGCTTAACTTCCCAGCTCTCCGATACCAAACCGGCTCGTCTCCGT 
CTGACGTTGGCGAATACGGACCTATTCAAGCTGCCGTTGACGCCAAGCTAGAAGCCATAT 
TAGCTGAGCCGAAGAATCAGCCGGGCAAAACGGAGAGGACGTCGAGGAAACGAGCTAAAG 
CCGCGGCTTCTTCAGCTGAGCAGCCGTCAGCGCCACAACAACATTCCGGGTCGGGTGAAA 
GTGATGGGTCGGGTTCACCGACTTCGGATGTTATGGTGCAGGAGATGTGCCAAGAGCCAG 
AGATGCCATGGAATGAAAATTTCATGCTCGGCAAGTGTCCTTCTTATGAGATAGATTGGG
CTTCAATTTTATCGTGAAAAATTAGGATT 

TATTTTCTTTTAAACTTTAGGGTTATTAGCTGTGCGTAAAATTT 

TATGAATGTAATGCAAGTGTGTAAATTATGGACAGCTCAAGCTTTTTTGTTAAAA 

>G1763 Amino Acid Sequence (conserved domain in AA coordinates: 140-209)

MADLFGGGHGGELMEALQPFYKSASTSASNPAFASSNDAFASAPNDPFSSSSYYNPHASF

FPSHSTTTYPDIYSGSMTYPSSFGSDLQQPENYQSQFHYQNTITYTHQDNNTCMLNFIEP 

SQPDFMTQPGPSSGSVSKPAKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGTFDTAEEAAL

AYDRAAFKLRGDSARLNFPALRYQTGSSPSDVGEYGPIQAAVDAKLEAILAEPKNQPGKT

ERTSRKRAKAAASSAEQPSAPQQHSGSGESDGSGSPTSDVMVQEMCQEPEMPWNENFMLG 

KCPSYEIDWASILS* 

>G1766 (32..1216)

AGGCTATTCTCGGAAAAACAAAGAATAAAGAATGAATTCGTTTTCACAAGTACCTCCTGG 

CTTCAGATTTCATCCTACTGATGAAGAACTTGTAGACTACTACTTGAGGAAAAAAGTTGC 

ATCAAAGAGAATAGAAATCGATATCATCAAGGATGTTGATCTTTACAAGATTGAGCCATG 

TGATCTTCAAGAGTTATGCAAGATAGGAAACGAAGAGCAGAGCGAATGGTACTTCTTTAG 

TCATAAAGACAAGAAGTATCCCACGGGAACTCGAACCAATAGAGCCACGAAAGCAGGATT 

TTGGAAAGCCACTGGAAGAGACAAGGCTATATATATAAGACATAGTCTTATCGGTATGAG 

GAAAACACTTGTGTTTTACAAAGG^UVGAGCCCCAAATGGTCAGAAATCCGATTGGATCAT 

GCACGAATATCGCTTAGAAACAAGTGAAAATGGAACCCCTCAGGAAGAAGGATGGGTAGT 

ATGTAGGGTATTCAAGAAGAAATTGGCAGCGACAGTGAGGAAAATGGGAGATTACCATTC 

ATCACCATCGCAGCATTGGTACGATGATCAGCTCTCTTTTATGGCCTCCGAGATCATTTC 

TAGTTCTCCACGACAGTTTCTTCCCAATCATCATTATAACCGCCACCATCACCAGCAGAC 

ATTGCCTTGTGGCCTCAATGCATTCAACAACAACAATCCTAACTTGCAATGCAAGCAAGA 

GCTCGAGTTACATTACAATCAAATGGTACAACATCAACAACAAAACCATCATCTTC^ 

ATCTATGTTTCTCCAGCTTCCTCAGCTCGAAAGCCCTACCAGTAATTGCAATTCTGACAA 

CAACAATAACACAAGAAATATTAGTAACTTGCAGAAATCATCAAATATATCTCATGAGGA 

ACAATTGCAACAAGGGAATCAAAGTTTCAGCTCTCTGTATTACGATCAAGGAGTAGAGCA 

AATGACTACTGACTGGAGAGTTCTCGATAAATTTGTTGCTTCACAGCTTAGCAATGATGA 

AGAGGCTGCAGCCGTGGTTTCTTCTTCTTCTCATCAAAACAACGTCAAGA^ 

AAACACGGGTTATCATGTGATAGATGAGGGAATAAATTTGCCGGAGAATGATTCTGAAAG 

GGTTGTTGAAATGGGAGAAGAGTATTCAAATGCTCATGCTGCTTCTACTTCTTCAAGTTG 

TCAGATTGATCTCTAGAAATAGTGATAGAGAGATGAAAAAGATGCAAGGTGAATATATAT 

GAAAATACATGCACACTAGTGTTATTTATACTTAAAGATGGAAGGGGAAAAACAAGGAGT 

TATTTCCTGGATTTATGGAGGTTTTGTACATAATAAAAACCTACAACCATATGGTATTTT

CTTTTGAAAAAAAAAAAAAAAAAAAAAAAA 

>G1766 Amino Acid Sequence (domain in AA coordinates: 10-153) 

MNSFSQVPPGFRFHPTDEELVDYYLRKKVASKRIEIDIIKDVDLYKIEPCDLQELCKIGN

EEQSEWYFFSHKDKKYPTGTRTNRATKAGFWKATGRDKAIYIRHSLIGMRKTLVF

PNGQKSDWIMHEYRLETSENGTPQEEGWVVCRVFKKKLAATVRKMGDYHSSPSQHWYDDQ

LSFMASEIISSSPRQFLPNHHYNRHHHQQIXPCGIjN^^ 

HQQQNHHLRESMFLQLPQLESPTSNCNSDNITNNTRNISNLQKSSNISHEEQL^ 
SLYY1X}GV^QMTTDWRVLDKFVASQLSNDEEAAAWSSSSHQNNV^ 
INLPENDSERVVEMGEEYSNAHAASTSSSCQIDL*
>G1767 (1..1596) 

ATGGATACTCTCTTTAGACTAGTCAGTCTCCAACAACAACAACAATCCGATAGTATCATT 
ACAAATCAATCTTCGTTAAGCAGAACTTCCACCACCACTACTGGCTCTCCACAAACTGCT 
TATCACTACAACTTTCCACAAAACGACGTCGTCGAAGAATGCTTCAACTTTTTCATGGAT 
GAAGAAGACCTTTCCTCTTCTTCTTCTCACCACAACCATCACAACCACAACAATCCTAAT
ACTTACTACTCTCCTTTCACTACTCCCACCCAATACCATCCCGCCACATCATCAACCCCT 
TCCTCCACCGCCGCAGCCGCAGCTTTAGCCTCGCCTTACTCCTCCTCCGGCCACCATAAT 
GACCCTTCCGCGTTCTCCATACCTCAAACTCCTCCGTCCTTCGACTTCTCAGCCAATGCC 
AAGTGGGCAGACTCGGTCCTTCTTGAAGCGGCACGTGCCTTCTCCGACAAAGACACTGCA 
CGTGCGCAACAAATCCTATGGACGCTCAACGAGCTCTCTTCTCCGTACGGAGACACCGAG 
CAAAAACTGGCTTCTTACTTCCTCCAAGCTCTCTTCAACCGCATGACCGGTTCAGGCGAA 
CGATGCTACCGAACCATGGTAACAGCTGCAGCCACAGAGAAGACTTGCTCCTTCGAGTCA 
ACGCGAAAAACTGTACTAAAGTTCCAAGAAGTTAGCCCCTGGGCCACGTTTGGACACGTG 
GCGGCAAACGGAGCAATCTTGGAAGCAGTAGACGGAGAGGCAAAGATCCACATCGTTGAC

ATAAGCTCCACGTTTTGC^CTCAATGGCCGACTCTTCTAGAAGCTTTAGCCAC^GATC^ 

GACGACACGCCTCACCTAAGGCTAACCACAGTTGTCGTGGCCAACAAGTTTGTCAACGAT 

CAAACGGCGTCGCATCGGATGATGAAAGAGATCGGAAACCGAATGGAGAAATTCGCTAGG 

CTTATGGGAGTTCCTTTCAAATTTAACATTATTCATCACGTTGGAGATTTATC 

GATCTCAACGAACTCGACGTTAAACCAGACGAAGTCTTGGCCATTAACTGCGTAGGCGCG 

ATGCATGGGATCGCTTCACGTGGAAGCCCTAGAGACGCTGTGATATCGAGTTTCCGACGG 

TTAAGACCGAGGATTGTGACGGTCGTAGAAGAAGAAGCTGATCTTGTCGGAGAAGAAGAA 

GGTGGCTTTGATGATGAGTTCTTGAGAGGGTTTGGAGAATGTTTACGATGGTTTAGGGTT 

TGCTTCGAGTCATGGGAAGAGAGTTTTCCAAGGACGAGCAACGAGAGGTTGATGCTAGAG 

CGTGCAGCGGGACGTGCGATCGTTGATCTTGTGGCTTGTGAGCCGTCGGATTCCACGGAG 

AGGCGAGAGACAGCGAGGAAGTGGTCGAGGAGGATGAGGAATAGTGGGTTTGGAGCGGTG 

GGGTATAGTGATGAGGTGGCGGATGATGTCAGAGCTTTGTTGAGGAGATATAAAGAAGGT 

GTTTGGTCGATGGTACAGTGTCCTGATGCCGCCGGAATATTCCTTTGTTGGAGAGATCAG 

CCGGTGGTTTGGGCTAGTGCGTGGCGGCCAACGTAA 

>G1767 Amino Acid Sequence (domain in AA coordinates: 255-272) 

MDTLFRLVSLQQQQQSDSIITNQSSLSRTSTTTTGSPQTAYHYNFPQNDVVEECFNFFMD 

EEDLSSSSSHHNHHNHNNPNTYYSPFTTPTQYHPATSSTPSSTAAAAALASPYSSSGHHN

DPSAFSIPQTPPSFDFSANAKWADSVLLEAARAFSDKDTARAQQILWTLNELSSPYGDTE
QKLASYFLQALFNRMTGSGERCYRTMVTAAATEKTCSFESTRKTVLKFQEVSPWATFGHV
AANGAILEAVDGEAKIHIVDISSTFCTQWPTLLEALAra 
QTASHRMMKEIGNRMEKFARLMGVPFKFNIIHHVGDLSEFDLNELDVKPDEVLAINCVGA
MHGIASRGSPRDAVISSFRRLRPRIVTVVEEEADLVGEEEGGFDDEFLRGFGECLRWFRV
CFESWEESFPRTSNERLMLERAAGRAIVDLVACEPSDSTERRETARKWSRRMRNSGFGAV 
GYSDEVADDVRALLRRYKEGVWSMVQCPDAAGIFLCWRDQPVVWASAWRPT*
>G1778 (1..627) 

ATGATGGGATACC^AACAAACTCTAATTTCTCCATGTTTTTTTC 
CAAAACCACCACAACTACGATCCTTATAATAATTTCTCTO 

ACTCTCTCACTTGGAACACCCTCTACrCGTCTCGACGACC^CCATAGATTTTCTTCTGCT 
AATTCTAACAACATCTCCGGCGACTTTTATATTCACGGAGGAAACGCTAAGACTTCTTCG 
TACAAGAAGGGTGGTGTTGCTCATAGCCTACCTCGCCGTTGTGCTAGCTGCGACACCACT 
TCAACTCCTCTATGGAGAAACGGACCAAAAGGACCTAAGTCGTTATGTAACGCGTGTGGA 
ATCCGATTCAAGAAAGAGGAGAGGCGTGCGACGGCCAGAAACTTAACGATCTCCGGTGGA 
GGTTCATCAGCGGCAGAAGTCCCAGTAGAGAATTCGTACAACGGAGGTGGAAACTATTAC 
AGTCATCATCATCATCACTATGCCTCGTCGTCGCCGTCGTGGGCTCATCAGAACACACAA
AGAGTTCCATATTTCTCACCGGTTCCGGAGATGGAATATCCCTACGTGGATAACGTCACG 
GCTTCTTCTTTTATGTCTTGGAATTGA 

>G1778 Amino Acid Sequence (domain in AA coordinates: 94-119)
MMGYQTOSNFSMFFSSENDDQNHHim^ 

NSNNISGDFYIHGGNAKTSSYKKGGVAHSLPRRCASCDTTSTPLWRNGPKGPKSLCNACG
IRFKKEERRATARNLTISGGGSSAAEVPVENSYNGGGNYYSHHHHHYASSSPSWAHQNTQ
RVPYFSPVPEMEYPYVDNVTASSFMSWN*
>G1789 (108..413)

C^GGACTCTGCGACATCTGTGC^C^TATCATTTCCTCAGAATCTCTTTCTTTTCTAGG 

TTTATTACTACAGAAAACCAAAGATCATCAACTTTAGTTACTAAACAAT 

CAATGTCTTCTTATGGCTCTGGCTCATGGACTGTTAAGCAGAACAAAGCCTTTGAGCGTG 

CTCTAGCAGTCTATGACCAAGACACTCCGGACCGTTGGCACAATGTTGCTAGAGCTGTTG 

GTGGTAAAACACCAGAAGAAGCTAAGAGACAGTATGACCTTCTAGTTCGTGACATCGAAA

GCATCGAGAATGGTCACGTGCCATTCCCTGACTACAAGACTACTACAGGAAACAGCAACA 

GAGGCAGGCTGCGTGATGAGGAAAAGAGGATGAGAAGCATGAAGCTGCAGTGAGACAAGA 

AGCAACAAAACCTAACTACGTATGATCGTCAAAATAAAAGAGAATCACTTCAGAGAGATG 

TGTTTTTTTCAATGTCTGACGAATCAATGTTTTTTTCTTGC^TTTCTCATGTTTTTCCC 

TAAGAAATGGTTTTTTTTTCGAGGCAACAAAAAAAAAA 

>G1789 Amino Acid Sequence (domain in AA coordinates: 1-50) 
MASGSMSSYGSGSWTVKQNKAFERALAVYDQDTPDRWHNVARAVGGKTPEEAKRQYDLLV
RDIESIENGHVPFPDYKTTTGNSNRGRLRDEEKRMRSMKLQ*
>G1790 (63..1346)
GAAAAAGACTTCACTTTTTTTTT

CAATGGAGAATTTCGTCGACGAGAATGGTTTTGCTTCTCT^ 

GTGATCAAGAAC^CATGAAAGAAGAAGATTTTCCATTCGAAGTCGTCGACCAATCAAAAC 
CTACAAGCTTTCTTCAAGATTTTCACCATCTTC 

ATCATCATGGCTCCTCATCTTC^CATCCTTTGCTCAGCGTCCAAACTACGTCTTCTTGT^ 
TCAATAATGCTCCTTTCGAGCATTGCTCTTACCAAGAAAACATGGTCGATTTCTATGAAA 
CTAAACCAAATTTGATGAATCATCATCATTTCCAAGCAGTGGAAAACTCATACTTCACTC 
GTAATCATCATCATCATCAAGAGATCAATTTGGTCGATGAACATGATGATCCTATGGACT 
TGGAGCAAAACAACATGATGATGATGAGGATGATCCCTTTTGATTACCCTCCTACAGAGA 
CTTTCAAACCTATGAACTTCGTAATGCCAGATGAAATTTCATGTGTTTCTGCAGATAATG
ATTGTTATAGAGCAACGAGTTTCAACAAGACCAAAC 

CTTCTTCTTCATCATCATCATGGAAAGAAACCAAAAAGTCAACCTTAGTCAAAGGACAAT 

GGACTGCTGAAGAAGACAGGGTACTGATTCAACTCGTGGAGAAGTATGGATTGCGTAAAT 

GGTCGCATATCGCTCAAGTGTTACCGGGAAGAATCGGGAAACAATGTAGAGAGAGGTGGC 

ATAACCATTTGAGACCTGACATTAAGAAAGAAACATGGAGTGAAGAAGAGGACAGAGTGT 

TGATAGAATTTCACAAAGAGATTGGAAACAAATGGGCAGAGATTGCGAAAAGACTCCCGG 

GAAGAACAGAGAACTCGATCAAGAACCATTGGAACGCAACAAAAAGAAGACAATTCT 

A^GAAAATGTAGATCTAAGTATCCAAGACCTTOTCTGTTGCAGGATTACATCAAGAGCT 

TGAATATGGGAGCTTTGATGGCTTCTTCTGTTCCTGCAAGAGGTAGACGCAGAGAGAGTA 

ATAACAAGAAGAAGGATGTTGTTGTTGCGGTTGAGGAGAAGAAGAAGGAAGAGGAGGTGT 

ATGGAC^GAC^GGATTGTGCCTGAATGTGTGTTTACTGATGATTTTGGATTCAATGAGA 

AGCTGCTTGAGGAAGGATGTAGCATTGACTCTTTGCTTGATGACATTCCTCAGCCTGACA 

TTGATGCTTTTGTTC^TGGGCTCTGATTTGTATTTTTTATTCTGCTTGTTTCAGTTTTGT 

TGTTTTTTGTTTGTCTTTTTATACGAGACAGATTCCA 

ATATAAAATATTTTGCTTTTTAAAAAAAAAAAAAAAAAAAAAAA 

>G1790 Amino Acid Sequence (conserved domain in AA coordinates: 217-316)
MENFVDENGFASLNQNIFTRDQEHMK^ 

HHGSSSSHPLLSVQTTSSCIl^APFEHCSYQENMVDFYETKPN^ 
NHHHHQEINLVDEHDDPMDLEQNNMMMMRMIPFDYPPTETFKPMNFVMPDEISCVSADND

CYRATSFNKTKPFLTRKLSSSSSSSSWKETKKSTLVKGQWTAEEDRVLIQLVEKYGLRKW
SHIAQVLPGRIGKQCRERWHNHLRPDIKKETWSEEEDRVLIEFHKEIGNKWAEIAKRLPG

RTENSIKNHWNATKRRQFSKRKCRSKYPRPSLLQDYIKSLNMGALMASSVPARGRRRESN
NKKKDVVVAVEEKKKEEEVYGQDRIVPECVFTDDFGFNEKLLEEGCSIDSLLDDIPQPDI

DAFVHGL* 

>G1791 (36..455)

ATGTACATGCAAAAACAAAAACCTTAAAAGCTTTCATG^^ 

CGAATGAGATGAAATACAGAGGCGTACGAAAGCGTCCATGGGGAAAATATGCGGCGGAGA 
TTCGCGACTCAGCTAGACACGGTGCTCGTGTTTGGCTTGGGACGTTTAACACAGCGGAAG 
ACGCGGCTCGGGCTTATGATAGAGCAGCTTTCGGCATGAGAGGCCAAAGGGCCATTCTCA 
ATTTTCCTCACGAGTATCAAATGATGAAGGACGGTC 

TGGCTTCCTCGTCGTCGGGATATAGAGGAGGAGGTGGTGGTGATGATGGGAGGGAAGTTA 
TTGAGTTCGAGTATTTGGATGATAGTTTA1TGGAGGAGCTTTTAGATTATGGTGAGAGAT 
CTAACCAAGACAATTGTAACGACGCAAACCGCTAGATCATCACTACTTACTTACAGTGTA 
ATGTTTTTGGAGTAAAGAGTAATAATCAATATAATATACTTTAGTTTAGGAAAAAAAAAA 
AAAAAAAAA 

>G1791 Amino Acid Sequence (domain in AA coordinates: TBD)
MERIESYNTNEMKYRGVRKRPWGKYAAEIRDSARHGARVWLGTFNTAEDAARAYDRAAFG
MRGQRAILNFPHEYQMMKDGPNGSHENAVASSSSGYRGGGGGDDGREVIEFEYLDDSLLE 
ELLDYGERSNQDNCNDANR* 
>G1793 (59..1783)

AGTGATTTATTGATTAACCCAAACACAAAATAAACAGATTTGACTCAAAAAGAAGAAAAT 
GAATTCTAACAACTGGCTTGGCTTTCCTCTTTCACCGAACAACTCTTCTTTGCCTCCTCA 
TGAATACAACCTTGGCTTGGTCAGCGACCATATGGAC^ 

GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAAAGTGGC 
CGATTTTCTCGGTGTGAGCAAACCGGACGAAAACCAATCCAACCACCTAGTAGCTTACAA 
CGACTCAGACTACTACTTCCATACCAATAGCTTGATGCCTAGCGTCCAATCAAACGATGT 
CGTTGTAGCAGCTTGTGACTCCAATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA 
GAGTGCTCACAATCTACAGTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT

TGTAGACAAAGCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 

CGTTGTTGAGACGGCCACGCCAAGACGTGCATTGGACACnTTCGGACAACGAACCTCGAT 

CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 

TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 

CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 

AACTACTACTAATTTCCCCATTACAAACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT 

GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 

TTCGATGTATCGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG

CCGAGTCGCCGGAAACAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGAAGAAGCAGC 

AGAAGCTTACGATATAGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACCAACTTCGA 

GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 

CGCAGCTAAACGGCTCAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 

GATGATAGCCCTTGGTTCAAGTTTCCAGTACGGTGGTGGCTCGAGCACAGGCTCTGGCTC 

CACCTC^TCAAGACTTCAGCTTCAACCTTACCCTCTAAGCATTCAACAACCATTAGAGCC 

TTTTCTATCTCTTCAGAACAATGACATCTCT 
CTCCTCTTTTAATCACCATAGCTATATCC 

CAATTACTTGCAGCAACAGTCGAGCCAGAACTCTC^GCAGCTCTACAATC 
TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCT 

CAATGGAGGCTCTAGTGGGAGCTACAACACTGCAGCATTTCTTGGGAACCACGGTATTGG 
TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA 
CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA 
GGGGTCAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 

GCGGCACAAGGAATGGGT 

>G1793 Amino Acid Sequence (conserved domain in AA coordinates: 179-255, 281-
MNSNNWLGFPLSPNNSSLPPHEYNLGLVSDHMDNPFQTQEWNMINPHGGGGDEGGEVPKV
ADFLGVSKPDENQSNHLVAYNDSDYYFH^ 

ESAHNLQSLTLSMGTTAGNNVVDKASPSETTGDNASGGALAVVETATPRRALDTFGQRTS

IYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLGGYDKEDKAARSYDLAALKYWGP

STTTNFPITNYEKEVEEMKHMTRQEFVAAIRRKSSGFSRGASMYRGVTRHHQHGRWQARI

GRVAGNKDLYLGTFSTEEEAAEAYDIAAIKFRGLNAVTNFEINRYDVKAILESSTLPIGG

GAAKRLKEAQALESSRKREAEMIALGSSFQYGGGSSTGSGSTSSRLQLQPYPLSIQQPLE 

PFLSLQNNDlSHYNNl^AHDSSSFNHHSYIQTQIiH^ 

HSNPALLHGLVSTSIVDNNNNNGGSSGSYNTAAFLGira 

YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE* 

>G1795 (27..422)

ACAAACACGCAAAAAGTCATXAATATATGGATCAAGGAGGTCGAGGTGTCGGTGCCGAGC 
ATGGAAAGTACCGGGGAGTTCGGAGACGACCTTGGGGAAAATATGCAGCAGAGATACGAG 
ATTCGAGGAAGCACGGTGAACGTGTGTGGCTTGGAACGTTCGATACGGCAGAGGAAGCGG 
CTAGAGCCTATGACCAAGCTGCTTACTCCATGAGAGGCCAAGCAGCAATCCTTAACTTCC 
CTCATGAGTATAACATGGGGAGTGGTGTCTCTTCTTCCACCGCCATGGCTGGATCTTCCT 
CCGCCTCCGCCTCCGCTTCTTCTTCTTCTAGGCAAGTTTTTGAATTTGAGTACTTGGATG 
ATAGTGTTTTGGAGGAGCTCCTTGAGGAAGGAGAGAAACCTAACAAGGGCAAGAAGAAAT 
GAGCGAGATATAATTCATGATTATTTCTAA 

>G1795 Amino Acid Sequence (domain in AA coordinates: 12-80) 
MDQGGRGVGAEHGKYRGVRRRPWGKYAAEIRDSRKHGERVWLGTFDTAEEAARAYDQAAY

SMRGQAAILNFPHEYNMGSGVSSSTAMAGSSSASASASSSSRQVFEFEYLDDSVLEELLE 

EGEKPNKGKKK*
>G1800 (61..894)

CCATTATC^TATCCTeTTCTTCCTTCTT(^CTATCAATCTTCTTCTCCACTAC^CAC^ 
ATGGAGAAATCATCCTCAATGAAACAATGGAAGAAGGGTCCTGCTCGGGGTAAAGGCGGT 
CCACAAAACGCTCTTTGTCAGTACCGTGGAGTCAGGCAAAGGACTTGGGGCAAATGGGTG 
GCTGAGATCAGAGAGCCCAAGAAGAGGGCAAGACTTTGGCTTGGCTCTTTCGCTACAGCT 
GAAGAAGCAGCTATGGCTTATGATGAGGCTGCCTTGAAACTCTATGGGCACGACGCATAC 
CTCAACTTACCTCATCTTCAGCGGAATACAAGACCTTCTCTGAGTAACTCTCAGAGGTTC 
AAATGGGTACCTTCAAGGAAGTTTATATCTATGTTTCCTTCATGTGGTATGCTAAACGTG 
AATGCTCAGCCTAGTGTTCACATAATCCAGCAAAGACTAGAAGAACTCAAGAAAACTGGA 
CTTTTATCTCAATCCTATTCTTCT

TTTCTTGATGAGAAGACCAGCAAGGGAGAAACAGACAATATGT^ 

AAGAAACCAGAGATCGACCTGACCGAGTTTCTTCAGCAACTAGGAATCTTGAAGGATGAA 
AATGAAGCAGAACCAAGTGAGGTAGCAGAGTGTCATTCCCCTCCACCATGGAACGAGCAA 
GAAGAAACTGGAAGTCCTTTCAGAACTGAGT^TTTCAGCTGGGATACCCTGATCGAGATG 
CCAAGAAGTGAAACCACAACTATGCAATTTGACrCCAGCAAC 

GAGGATGATGTATCCTTCCCTTCCATCTGGGACTACTACGGAAGCTTAGATTGAGTAAAA 

GC^TTTAAGGTAGATCAAGATTCAGAAGTACACAAATGGTTTTGGATTTAGTG^ 

TTTGGAAAAGAGAC^TAGGTAGTGAGAGTGCAGTCTTTTATTATGCAGCAATAAAGTGAG 

TGAGTGTACAACCGAGTTGTTCGCTTTTTTTGGT^ 

CGCTAAAAAAAAAAAAAAAAAAAAAAA 

>G1800 Amino Acid Sequence (domain in AA coordinates: TBD) 
MEKSSSMKQWKKGPARGKGGPQNALCQYRGVRQRTWGKWVAEIREPKKRARLWLGSFATA
EEAAMAYDEAALKLYGHDAYLNLPHLQRNTRPSLSNSQRFKWVPSRKFISMFPSCGMLNV

NAQPSVHIIQQRLEELKKTGLLSQSYSSSSSSTESKTNTSFLDEKTSKGETDNMFEGGDQ 
KKPEIDLTEFLQQLGILKDENEAEPSEVAECHSPPPWNEQEETGSPFRTENFSWDTLIEM 
PRSETTTMQFDSSNFGSYDFEDDVSFPSIWDYYGSLD*
>G1806 (1..1356) 

ATGCAGAGCAGCTTCAAAACCGTTCCTTTCACTCCTC 

TTCTTCAGAGGAGATAGTTGTCTTGAGGAGTTTCATCAACCAGTCAATGGTTTTCACCAT 
GAAGAAGCTATCGATTTAAGTCCAAATGTCACTATTGCTTCAGCTAACTTACACTACACG 
ACGTTTGATACGGTTATGGATTGTGGTGGTGGTGGTGGTGGCTTGAGGGAGAGACTTGAA 
GGAGGAGAAGAGGAGTGTTTGGACACAGGGCAATTAGTGTACCAGAAAGGGACAAGATTA 
GTAGGAGGAGGAGTAGGAGAAGTGAAC^GCAGTTGGTGTGATTCGGTTTCAGCTATGGCT 
GATAACAGTCAACATACTGACACTTCCACAGATATTGATACTGATGACAAGACTCAGTTG 
AATGGAGGTCATCAAGGGATGCTATTGGCTACAAATTGTTCAGATCAATCCAATGTGAAA 
TCTAGTGATCAAAGGACACTTCGTCGACTTGCTCAGAACCGGGAGGCTGCTAGGAAAAGT 
CGGTTGAGGAAAAAGGCCTATGTTCAGCAACTTGAGAATAGTCGAATCAGGCTTGCACAG 
CTAGAGGAAGAGCTCAAAAGAGCTCGCCAAC^GGGATCTTTGGTTGAAAGAGGAGTTTCA 
GCGGATCACACGCATTTGGCAGCAGGAAATGGTGTCTTTTCATTTGAATTGGAATATACA 
CGTTGGAAGGAGGAACATCAAAGAATGATCAACGACTTAAGATCGGGTGTGAATTCGCAG 
TTAGGTGACAACGATCTACGCGTTCTAGTGGATGCTGTGATGAGTCACTATGATGAAATA 
TTCAGGCTAAAGGGAATTGGCACTAAAGTTGAAGTCTTTCATATGCTCTCAGGCATGTGG 
AAGACACCTGCCGAGAGATTTTTC^TGTGGTTAGGTGGATTTAGATCATCAGAGTTACTT 
AAGATATTGGGGAACGATGTGGATCCATTGACGGACCAGCAGTTGATAGGCATTTGCAAC 
CTTCAGC^TCGTCTCAAC^GCAGAGGATGCATTGTC^CAAGGCATGGAAGCTCTAC^ 
CAATCACl^CTCGAGACGCTTTCTTCTGCTTCTATGGGTCCAAACTCTTCAGCAAATGTT 
GCAGATTATATGGGTCATATGGCTATGGCTATGGGCAAACTTGGCACTCTTGAAAACTTC 
CTTCGCCAGGCTGATTTATTGAGGCAACAAACTCTGCAACAGCTTCACAGAATTCTCACC 
ACACGACAAGCTGCTCGCGCCTTTTTGGTCATCCACGATTATATTTCTCGGCTTAGAGCA 
CTTAGCTCTCTATGGTTAGCCAGACCTAGAGACTAA 

>G1806 Amino Acid Sequence (domain in AA coordinates: 165-225)

MQSSFKTVPFTPDFYSQSSYFFRGDSCLEEFHQPVNGFHHEEAIDLSPNVTIASANLHYT 

TFDTVMDCGGGGGGLRERLEGGEEECLDTGQLVYQKGTRLVGGGVGEVNSSWCDSVSAMA 

DNSQHTDTSTDIDTDDKTQLNGGHQGMLLATNCSDQSNVKSSDQRTLRRLAQNREAARKS 

RLRKKAYVQQLENSRIRLAQLEEELKRARQQGSLVERGVSADHTHLAAGNGVFSFELEYT

RWKEEHQRMINDLRSGVNSQLGDNDLRVLVDAVMSHYDEIFRLKGIGTKVEVFHMLSGMW

KTPAERFFMWLGGFRSSELLKILGNHVDPLTDQQLIGICNLQQSSQQAEDALSQGMEALQ

QSLLETLSSASMGPNSSANVADYMGHMAMAMGKLGTLENFLRQADLLRQQTLQQLHRILT

TRQAARAFLVIHDYISRLRALSSLWLARPRD* 

>G1811 (93..827)

AAAGGAGCATTGGTATCTCAAACAATATTTGCCCTTTCTCTATCTCTCTCTCATCACTAT 

TTGCCATCTCTTTCTCTCTCCCTCTCTTTCAAATGTCAATAAACCAATACTCAAGCGATT 

TCCACTACCATTCTCTCATGTGGCAACAA(^GCAGCAACAACAACAACACCAAAAC 

TCGTGGAAGAAAAAGAAGCTCTTTTCGAGAAACCCTTAACCCCAAGTGACGTCGGAAAAC 

TCAACCGCCTCGTCATCCCAAAACAGCACGCCGAGAGATACTTCCCACTAGCGGCCGCCG 

CCGCAGACGCCGTGGAGAAAGGACTTCTCCTCTGCTTTGAGGACGAGGAAGGTAAACCAT 
GGAGATTCAGATACTCGTACTGGAACAGTAGCCAGAGTTATGTCTTGACCAAAGGCTGGA

GCAGATACGTCAAGGAGAAGCACCTTGACGCCGGAGACGTCGTTCTCTTCCATCGACACC 

GTTCAGACGGCGGAAGATTCTTCATTGGCTGGAGAAGACGCGGTGACTCTTCTTCCTCCT 

CCGACTCTTATCGCCATGTTCAATCCAATGCCTCGCTCCAATATTATCCTCATGCAGGGG 

CTCAAGCGGTGGAGAGCCAAAGAGGCAACTCGAAGACATTAAGACTGTTCGGAGTGAACA 

TGGAGTGCCAGCTAGATTCGGACTGGTCCGAGCCATCCACACCTGACGGTTCTAACACAT 

ATACyU\CCAATCACGACCAGTTTCATTTCTACCCTCAACAACAACACTATCCTCCTCCGT 

ACTACATGGACATAAGTTTCACAGGAGATATGAACCGGACGAGCTAGAAGCCCACAAGGA 

TTAAAAAAAAGCTTCACATCTGGTCCTGTTATGTTGTCATAGATGTTGATTCCTTAATTT 

TACACAAGCTTCATTTTGC^TTATTTAAAGTAAAATCGTATTT^ 

TCTCTCAATTTTCACTCTCTTCCTTTT^ 

ACACTTGTATAGAGAATTCAAAGTTCTGGCTATTTTCGAAAGTTATCTTTTCTCTTAAAA 
AAAAAAA 

>G1811 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSINQYSSDFHYHSLMWQQQQQQQQHQNDVVEEKEALFEKPLTPSDVGKLNRLVIPKQHA
ERYFPLAAAAADAVEKGLLLCFEDEEGKPWRFRYSYWNSSQSYVLTKGWSRYVKEKHLDA
GDVVLFHRHRSDGGRFFIGWRRRGDSSSSSDSYRHVQSNASLQYYPHAGAQAVESQRGNS
KTLRLFGVNMECQLDSDWSEPSTPDGSNTYTTNHDQFHFYPQQQHYPPPYYMDISFTGDM 

NRTS* 

>G182 (74..1366)

CGTCGACGATCAGATTCTTGCGTATAGCTGTATATATACAC^^ 

TCATATATAGATTATGTGCAGCGTCTCTGAGCTTCTTGACATGGAAAACTTCCAAGGAGA 

CTTAACCGACGTCGTACGAGGAATCGGAGGCCACGTGTTATCACCGGAGACTCCTCCCTC 

GAACATCTGGCCTCTTCCTCTGTCACATCCAACACCATCACCGTCAGATCTTAACATAAA 

CCCCTTCGGAGATCCCTTTGTGAGCATGGACGATCCACTCCTCCAAGAACTAAACTCCAT 

CACAAACTCCGGCTATTTCTCCACCGTAGGAGATAACAACAACAACATTCACAA 

TGGTTTCTTGGTTCCAAAGGTATTTGAGGAGGATC^TATAAAGAGTCAATGTAGTATCTT 

CCCAAGAATCCGGATCTCGCATAGTAACATCATCCACGATTCTTCTCCGTGTAATTCTCC 

GGCCATGTCGGCTCACGTTGTCGCAGCCGCAGCAGCCGCCTCGCCGAGAGGCATCATCAA 

CGTAGACACAAACAGTCCTAGAAACTGTCTATTGGTTGATGGTACCACGTTCTCOT 

GATTCAGATATCTTCCCCTCGGAATCTAGGCCTTAAAAGAAGGAAGAGTCAGGCAAAGAA 

GGTGGTGTGTATTCCGGCCCCGGCTGCAATGAACAGCCGATCAAGCGGAGAAGTGGTTCC 

ATCGGATCTATGGGCTTGGCGTAAATACGGTCAAAAACCTATCAAAGGCTCTCCTTTTCC 

AAGGGGTTATTATAGATGCAGCAGCTCAAAAGGTTGTTCAGCAAGAAAGCAAGTCGAAAG 

AAGCCGAACCGATCCAAACATGTTGGTGATTACATATACCTCCGAACATAACCATCCTTG 

GCCCATCC^CGC^CGCTCTCGCCGGCTCCACACGCTCCrCCACCTCCTCCTCATCTAA 

CCCTAATCCTTCCAAACCCTCAACCGCAAACGTAAACTCCTCATCCATTGGCTCCCAAAA 

OICCATCTACTTGCCTTCCTCC^CCACTCCTCCTCCTACCCTCTCATCCTCCGCCATCAA 

AGATGAACGAGGGGACGATATGGAGTTGGAAAACGTAGATGATGATGATGATAACCAGAT 

TGCTCCATACAGACCGGAGCTTCATGATCATCAGCACCAACCAGATGATTTCT^ 

TCTTGAAGAGCTAGAAGGAGATTCTCTAAGCATGTTGCTTTCTCATGGCTGTGGCGGCGA 

CGGGAAGGATAAAACGACCGCGTCCGATGGGATCAGCAATTTCTTCGGGTGGTCGGGAGA 

TAATAATTATAATAATTACGACGACCAAGACTCAAGGTCGTTATAGTATAGTGTTAATTA 

CAGGTAAAC^AATTATATTAAATTAAGTTGAGCTTGTGAAAATGAAGATCATATGGTCTG 

GTCAGGTTGGGGGC 

>G182 Amino Acid Sequence (conserved domain in AA coordinates: 217-276)

MCSVSELLDMENFQGDLTDVVRGIGGHVLSPETPPSNIWPLPLSHPTPSPSDLNINPFGD

PFVSMDDPLLQELNSITNSGYFSTVGDNNNNI^ 

ISHSNIIHDSSPCNSPAMSAHVVAAAAAASPRGIINVDTNSPRNCLLVDGTTFSSQIQIS

SPRNLGLKRRKSQAKKVVCIPAPAAMNSRSSGEVVPSDLWAWRKYGQKPIKGSPFPRGYY

RCSSSKGCSARKQVERSRTDPNMLVITYTSEHNHPWPIQRNALAGSTRSSTSSSSNPNPS 

KPSTANVNSSSIGSQNTIYLPSSTTPPPTLSSSAIKDERGDDMELENVDDDDDNQIAPYR 

PELHDHQHQPDDFFADLEELEGDSLSMLLSHGCGGDGKDKTTASDGISNFFGWSGDNNYN

NYDDQDSRSL* 

>G1835 (1..969) 

ATGATTGGAACAAGCTTCCCCGAGGATCTTGATTGTGGCAACTTCTTTGACAACATGGAT 
GATCTCATGGACTTTCCCGGTGGAGATATCGATGTCGGTTTCGGCATAGGTGACTCCGAC 
TCTTTCCCTACCATCTGGACCACTCATCACGACACGTGGCCTGCCGCTTCTGATCCTCTC
TTCTCTTCCAACACCAACTCTCATTCATCACCTGAGCTCTATGTTCCGTTTGAGGAC^TT 
GTTAAGGTGGAAAGACCTCC^GCTTTGTAGAGGAAACATTGGTTGAGAAGAAGGAAGAT 
TCGTTTTCGACAAACACTGATTCATCATCTTCTCATAGCCAATTCAGGAGCTCAAGTCCA 
GTGTCGGTTCTCGAGAGCAGCTCCTCCTCGTCTCAAACCACCAACACAACCTCCCTTGTr 
CTCCCTGGAAAGCACGGTCGTCCACGCACAAAACGCCCTCGTCCACCTGTCCAGGATAAA 
GATAGAGTCAAAGACAATGTGTGCGGTGGTGACTCGCGCCTCATCATTAGAATACCGAAA 
CAGTTTCTCTCTGATCACAACAAGATGATCyUlCAAGAAGAAGAAGAAGAAGGCCAAGATT 
ACTTCTTCCTCTTCTTCGTCCGGGATTGATCTTGAAGTCAATGGAAACAACGTCGATTCG 
TATTCTTCAGAGCAATATCCGCTTAGGAAATGTATGCACTGTGAGGTCACCAAGACTCCA 
CAGTGGAGGCTTGGTCCAATGGGTCCAAAGACACTTTGCAATGCGTGCGGTGTACGTTAC 
AAATCAGGGAGGCTTTTCCCGGAGTACCGTCCAGCTGCTAGTCCT^CATTTACTCCAGCT 
CTTCACTCAAACTCACACAAGAAAGTGGCTGAAATGAGAAACAAGAGATGCAGTGATGGT 
AGCTACATAACCGAAGAGAATGATCTGCAAGGGCTGATTCCGAACAATGCCTACATTGGC 

GTAGACTAA 

>G1835 Amino Acid Sequence (domain in AA coordinates: 224-296) 

MIGTSFPEDLDCGNFFDNMDDLMDFPGGDIDVGFGIGDSDSFPTIWTTHHDTWPAASDPL 

FSSNTNSDSSPELYVPFEDIVKNTORPPSFVEETLVEK^ 

VSVLESSSSSSQTTNTTSLVLPGKHGRPRTKRPRPPVQDKDRVKDNVCGGDSRLIIRIPK 
QFLSDHNKMINKKKKKKAKITSSSSSSGIDLEVNGNNVDSYSSEQYPLRKCMHCEVTKTP
QWRLGPMGPKTLCNACGVRYKSGRLFPEYRPAASPTFTPALHSNSHKKVAEMRNKRCSDG

SYITEENDLQGLIPNNAYIGVD* 
>G1836 (47..610)

ATAACAAGCCTAGAACACTAGAAACTTCAAAAAAGAAAAAAATCTTATGGAGAACAACAA 
CGGCAAC^CCAGCTGCCACCGAAAGGTAACGAGCAACTGAAGAGTTTCTGGTCAAAAGA 
GATGGAAGGTAACTTAGATTTCAAAAATCACGACCTTCCTATAACTCGTATCAAGAAGAT 
TATGAAGTATGATCCGGATGTGACTATGATAGCTAGTGAGGCTCCAATCCTCCTCTCGAA 
AGCATGTGAGATGTTTATCATGGATCTCACGATGCGTTCGTGGCTCCATGCTCAGGAAAG 
CAAACGAGTCACGCTACAGAAATCTAATGTCGATGCCGCAGTGGCTCAAACTGTTATCTT 
TGATTTCTTGCTTGATGATGA(^TTGAGGTAAAGAGAGAGTCTGTTGCCGCCGCTGCTGA 
TCCTGTGGCCATGCCACCTATTGACGATGGAGAGCTGCCTCCAGGAATGGTAATTGGAAC 
TCCTGTTTGTTGTAGTCTTGGAATCCACCAACCACAACCACAAATGCAGGCATGGCCTGG 
AGCTTGGACCTCGGTGTCTGGTGAGGAGGAAGAAGCGCGTGGGAAAAAAGGAGGTGACGA 
CGGAAACTAATAAGTGGAATACGTTTTAGGGTATTTTCAAGGGAATATGTAGTAAATAGT 

CATGGATC 

>G1836 Amino Acid Sequence (domain in AA coordinates: 30-164) 
MENNNGNNQLPPKGNEQLKSFWSKEMEGNLDFKNHDLPITRIKKIMKYDPDVTMIASEAP

ILLSKACEMFIMDLTMRSWLHAQESKRVTLQKSNVDAAVAQTVIFDFLLDDDIEVKRESV
AAAADPVAMPPIDDGELPPGMVTGTPVCCSLGIHQPQPQMQAWPGAWTSVSGEEEEARGK 

KGGDDGN* 

>G1838 (132..1628)

TTCCTTGGCATTCTCTTTAGAACTTTCGTACAAAATGCAAAACCTGAACCTCTAAAGCTA 
AAAAAAAAGATTAGAGACTGTAACTGCTTTTATCAGATTTTCAACTAGGAAAAAAGTTAC 
AATCTTTTTTGATGGCTCC^CAATGAC 

AGATGTTGAAATCAACTGATCAGTCTCACTTCTCTTCTTCTTACGACGATTCTTCTACTC 
CTTATCTCATCGATAACTTCTATGCTTTCAAAGAAGAAGCTGAGATAGAAGCTGCTGCTG 
CTTCAATGGCGGATTCAACAACCTTATCTACTTTTTTCGATCATTCTCAGACTCAGATTC 
CAAAGCTGGAAGATOTCCTCGGTGATTCCTTTGTCCGTTACTCTGATAACCAAACAGAGA 
CCCAAGACTCTTCTTCTCTCACTCCATTCTACGATCCACGTCACCGCACCGTTGCCGAAG 
GAGTTACAGGGTTCTTCTCTGATCATCATCAGCCAGATTTCAAGACGATAAACTCGGGAC 
CAGAAATCTTCGATGACTCAACAACTTCCAACATCGGTGGTACTCATCTCTCCAGTCACG 
TGGTGGAGTCATCAACGACGGCGAAGTTAGGGTTTAACGGTGATTGCACCACCACCGGAG 
GAGTTTTGTCTCTAGGGGTTAACAACACATCAGATCAACCTTTGAGCTGTAACAATGGCG 
AGAGAGGTGGAAACAGTAACAAGAAGAAAACAGTTTCTAAGAAGGAAACATCAGATGATT 
CAAAGAAGAAGATTGTCGAAACATTGGGACAAAGAACTTCAATTTATCGTGGAGTCACCC 
GACATAGATGGACTGGAAGATACGAAGCGCATCTATGGGATAACAGCTGTAGGAGGGAAG 
GTCAAGCCAGAAAAGGACGTCAAGTGTACTTAGGTGGATATGACAAGGAAGATAGAGCAG 
CTAGAGCCTATGACTTGGCAGCTTTAAAATACTGGGGTTCTACTGCTACTACAAATTTTC

CGGTCTCGAGTTAT^CAAAAGAACTTGAGGAAATGAATCACATGACCAAGCT^GAGTTTA 

TTGCATCTCTTAGGAGGAAAAGTAGCGGTTTTTCGAGAGGAGCTTCAATATATAGAGGTG 

TCACAAGGCATCATCAACAAGGTCGCTGGCAAGCAAGAATCGGCCGTGTCGCAGGAAACA 

AAGATCTTTACCTCGGAACCTTTGCAACCGAAGAGGAAGCAGCAGAGGCTTATGACATTG 

CAGCCATAAAGTTCAGAGGAATCAACGCAGTAACTAACTTTGAGATGAACAGGTATGACA 

TTGAAGCTGTCATGAATAGTTCTTTACCTGTAGGAGGAGCAGCTGCGAAACGCCACAAAC 

TCAAACTCGCTCTTGAATCTCCTTCTTCATCATCCTCTGACCATAACCTCCAACAACAAC 

AGTTGCTTCCGTCCTCTTCTCCCTCGGATCAAAACCCTAACTCAATCCCATGTGGCATTC 

CATTTGAGCCTTCAGTTCTCTATTACCACCAGAACTTCTTTCAGCATTATCCTTTGGTCT 

CTGACTCTACAATTCAAGCTCCTATGAACCAAGCTGAGTTTTTCTTGTGGCCTAACCAG 

(nTACTAAATCATTTGGTTCGTTCTTGCTTAGACTTCTATTCACCGCACTAACCGATGAC 

CCGAGGCTTATCTTCTTGATTCTGGCTATAAGGATGAATCTTTCAAGTTCCTTTTTTAAC 

TGTAGGCTAAGACAGAAGTAGAGGGGAGAAAAGTTGAAGAATCTGAAACTTTTGGGGTCA 

ATTTTGTATTAATGTTTTTCTTTTGT 

TGAATGTAATCGGCCTATAACGGTATAACTCTGTTTCCATTTATGAATATTTTTCTCAAA
TTGAAAAAAAAAAAAAAAAAA 

>G1838 Amino Acid Sequence (conserved domain in AA coordinates: 229-305, 330-400)
MAPPMTNCLTFSLSPMEMLKSTDQSHFSSSYDDSSTPYLIDNFYAFKEEAEIEAAAASMA

DSTTLSTFFDHSQTQIPKLEDFLGDSFVRYSDNQTETQDSSSLTPFYDPRHRTVAEGVTG

FFSDHHQPDFKTINSGPEIFDDSTTSNIGGTHLSSHVVESSTTAKLGFNGDCTTTGGVLS

LGVNNTSDQPLSCNNGERGGNSNKKKTVSKKETSDDSKKKIVETLGQRTSIYRGVTRHRW

TGRYEAHLWDNSCRREGQARKGRQVYLGGYDKEDRAARAYDLAALKYWGSTATTNFPVSS

YSKELEEMNHMTKQEFIASLRRKSSGFSRGASIYRGVTRHHQQGRWQARIGRVAGNKDLY

LGTFATEEEAAEAYDIAAIKFRGINAVTNFEMNRYDIEAVMNSSLPVGGAAAKRHKLKLA

LESPSSSSSDHNLQQQQLLPSSSPSDQNPNSIPCGIPFEPSVLYYHQNFFQHYPLVSDST
IQAPMNQAEFFLWPNQSY*

>G1843 (51..653)

CAGACATCACAATCAAATTAGGTCAGAAGAATTAGTCGGAGAAAACAGCCATGGGAAGAA 
GAAAAGTAGAGATCAAACGAATTGAGAACAAAAGCTCTCGACAAGTTACTTTCTGTAAAC 
GACGAAATGGTCTCATGGAGAAAGCTCGTCAACTCTCAATTCTTTGTGAATCCTCCGTCG 
CTCTTATO^TCATCTCTGCCACCGGAAGACTCTACAGCTTCTCCTCAGGTGATAGCATGG 
CCAAGATCCTCAGTCGTTATGAATTAGAACAGGCTGATGATCTTAAAACCTTGGATCTAG 
AAGAAAAAACTCTTAATTATCTTTCGCACAAGGAGTTGCTAGAAACAATCCAATGCAAGA 
TTGAAGAAGCGAAAAGCGATAATGTAAGTATAGATTGTCTAAAGTCCCTGGAAGAGCAGC 
TCAAGACTGCTCTGTCTGTAACTAGAGCTAGGAAGACAGAACTAATGATGGAGCTTGTGA 
AGACCCATCAAGAGAAGGAGAAGCTGCTGAGAGAGGAGAACCAGAGTTTGACTAACCAG^ 
TTATAAAGATGGGGAAGATGAAGAAGTCTGTGGAAGCAGAGGATGCAAGAGCAATGTCAC 
CGGAAAGTAGCTCTGACAACAAGCCACCGGAGACTCTCCTGCTTCTCAAGTAACCACCAT 
CACCAACGACTGATTCGAAAAATAAAAATTGTAAAAATTATGATTTGTAGTTCATAAGGA 
AAGCTAC^TACTGTATGTTAAAAATCCTCTTCTTCCCCCTGCTACGGAAAAGTC^TCC^A 
GGAGATGCATCAAATAAAGTAATTGATTTTTATTGTTA 

>G1843 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKVEIKRIENKSSRQVTFCKRRNGLMEKARQLSILCESSVALIIISATGRLYSFSSG

DSMAKILSRYELEQADDLKTLDLEEKTLNYLSHKELLETIQCKIEEAKSDNVSIDCLKSL

EEQLKTALSVTRARKTELMMELVK™^

AMSPESSSDNKPPETLLLLK*

>G1853 (1..1860)

ATGAGAGGTTCTTGGTACAAGAGTGTTTCCTCTGTTTTTGGTCTCAGACCACGGATCAGA 
GGGTTGTTATTCTTCATTGTTGGTGTTGTGGCTCTAGTTACTATTTTAGCACCATTGACA 
TCTAATTCGTATGATTCTTCGTCAAGTTCGACACTTGTGCCGAACATTTATAGTAACTAT 
AGGAGGATAAAGGAGCAAGCTGCTGTTGATTATCTTGATCTGAGGTCTCTTTCTTTAGGG 
GCTAGTTTAAAAGAGTTTCCTTTTTGTGGTAAAGAAAGAGAAAGTTATGTGCCTTGTTAT 
AACATAACTGGGAATTTGCTTGCTGGGCTTCAAGAGGGTGAGGAGTTAGATCGACATTGC 
GAGTTTGAAAGAGAGAAGGAAAGATGTGTAGTTCGTCCTCCGAGAGATTATAAAATACCA 
CTTAGGTGGCCACTTGGTAGAGATATCATATGGAGTGGGAACGTGAAGATTACCAAAGAC 
CAGTTTCTTTCTTCAGGAACTGTGACAACGAGGTTAATGTTGCTTGAAGAGAATCAAATA 
ACCTTTCACTCGGAGGACGGCCTGGTCTTTGATGGGGTCAAAGACTATGCTCGTCAAATT

GCTGAGATGATAGGTTTAGGAAGTGATACTGAATTTGCTCAAGCGGGTGTACGGACTGTG 

TTAGACATTGGTTGCGGATTTGGTAGCTTTGGTGCTCATTTAGTGTCTTTGAAGCTGATG 

CCTATATGTATTGCTGAGTATGAGGCAACTGGGAGCCAAGTTCAGTTAGCTCTAGAGAGA 

GGCCTTCCTGCAATGATTGGGAATTTCTTTT 

TTTGATATGGTCCATTGTGCTCAATGTGGCACT^ 

CTTTTGGAAGTGGATCGTGTTCTGAAACCCGGG 

AACAAAGCACAGGGAAACTTACCAGATACCAAGAAAACGAGCATCTCAACACGGGTGAAT 

GAGTTATCTAAGAAAATCTGTTGGAGTCTAACAGCTCAGCAGGATGAGACGTTTCTTTGG 

CAGAAAACTTCTGATTCZAAGTTGCTATTCTTCTCGTTCGCAAGCTTCTATACCT 

AAAGATGGAGATAGCGTTCCGTATTACCACCCZATTGGTTCCATGTATAAGCGGAACCACG 

AGTAAACGCTGGATTTCTATACAGAACAGGTCTGCTGTTGCAGGAACAACCTCTGCCGGG 

CTTGAAATTCATGGTTTAAAACCGGAAGAATTCTTCGAGGATACACAAATATGGAGATCA 

GCTCTCAAAAACTATTGGTCCTTGCTTACACCTCTAATTTTCTCTGACCATCCGAAGAGA 

CCCGGTGATGAGGATCCTCTCCCGCCTTTCAACATGATACGCAATGTGATGGACATGCAT 

GCTCGTTTTGGGAATTTAAATGCCGCTTTACTCGACGAAGGAAAATCTGCTTGGGTAATG 

AACGTCGTCCC^GTCAATGCACGTAATACTCTTCCTATCATACTTGATCGTGGTTTCGCC 

GGTGTTCTACATGACTGGTGTGAACCATTCCCGACATATCCTCGAACATATGACATGCTT 

CATGCCAATGAACTTCTCACACATCTTAGCTCAGAACGATGCAGCCTAATGGACTTGTTC

TTGGAGATGGACCGGATTCTTCGCCCTGAGGGATGGGTTGTTCTAAGCGACAAAGTGGGA 

GTAATCGAGATGGCTCGAGCACTTGCAGCTCGAGTGCGTTGGGAAGCAAGAGTCATTGAT 

CTTCAAGATGGTAGTGACCAAAGACTTCTCGTCTGTCAAAAACCATTCATCAAAAAATAA 

>G1853 Amino Acid Sequence (domain in AA coordinates: entire protein) 

MRGSWYKSVSSVFGLRPRIRGLLFFIVGVVALVTILAPLTSNSYDSSSSSTLVPNIYSNY

RRIKEQAAVDYLDLRSLSLGASLKEFPFCGKERESYVPCYNITGNLLAGLQEGEELDRHC

EFEREKERCVVRPPRDYKIPLRWPLGRDIIWSGNVKITKDQFLSSGTVTTRLMLLEENQI

TFHSEDGLVFDGVKDYARQIAEMIGLGSDTEFAQAGVRTVLDIGCGFGSFGAHLVSLKLM

PICIAEYEATGSQVQLALERGLPAMIGNFFSKQLPYPALSFDMVHCAQCGTTWDIKDAML

LLEVDRVLKPGGYFVLTSPTNKAQGNLPDTKKTSISTRVNELSKKICWSLTAQQDETFLW

QKTSDSSCYSSRSQASIPLCKDGDSVPYYHPLVPCISGTTSKRWISIQNRSAVAGTTSAG

LEIHGLKPEEFFEDTQIWRSALKNYWSLLTPLIFSDHPKRPGDEDPLPPFNMIRNVMDMH

ARFGNLNAALLDEGKSAWVMNVVPVNARNTLPIILDRGFAGVLHDWCEPFPTYPRTYDML

HANELLTHLSSERCSLMDLFLEMDRILRPEGWVVLSDKVGVIEMARALAARVRWEARVID

LQDGSDQRLLVCQKPFIKK* 

>G1855 (1..1902) 

ATGGCGAAAGAGAACAGTGGTCATCATCACCAAACAGAAGCAAGAAGAAAGAAACTAACT 
TTGATTCTTGGTGTAAGTGGACTCTGCATTTTGTTCTATGTTTTAGGTGCZATGGCAAGCC 
AATACCGTCCCATCTTCTATCTCGAAGCTCGGATGCGAGACGCAATCAAACCCTTCTTCG 
TCCTCTTCCTCTTCCTCATCTTC^^ 

ATTGAGTTAAAGGAAACAAACCAAACC^TTAAGTACTTTGAACCATGTGAATTATCTCTC 

AGTGAGTACACTCCTTGTGAAGACCGACAAAGAGGAAGAAGATTCGATAGGAACATGATG 

AAATATAGAGAAAGAC^TTGTCCTGTAAAAGATCAGCTTCTTTATTGTTTGATTCCTCCT 

CCACCAAACTACAAGATTCCATTTAAATGGCCACAAAGTAGAGACTATGCTTGGTATGAC 

AATATCCCTCACAAGGAACTTAGTGTTGAGAAAGCAGTTCAAAACTGGATTCAAGTTGAA 

GGTGACCGCTTTAGATTCCCTGGTGGTGGTACTATGTTTCCTCGTGGAGCTGATGCTTAT 

ATCGATGATATTGCTAGGCTTATTCCTCTTACTGATGGTGGAATCAGAACAGCTATTGAC 

ACTGGATGTGGTGTTGCAAGTTTTGGTGCTTACCTCTTGAAGAGAGACATTATGGCTGTG 

TCTTTTGCTCCAAGAGACACTCATGAAGCTCAGGTACAGTTTGCTTTAGAACGCGGAGTT 

CCTGCGATAATCGGGATTATGGGATCAAGAAGACTTCCTTATCCAGCTAGAGCTTTTGAT 

CTTGCTCATTGTTCTCGTTGTTTGATCCCTTGGTTTAAAAATGATGGTTTGTACCTTATG 

GAGGTCGACCGGGTTTTAAGACCGGGCGGTTACTGGATCCTCTCGGGACCACCGATTAAC 

TGGAAACAGTACTGGAGAGGGTGGGAGAGAACAGAGGAGGATTTGAAGAAAGAGCAAGAT 

TCAATAGAAGATGTAGCAAAGAGTCTTTGCTGGAAGAAAGTAACTGAAAAAGGTGACTTA 

TCAATTTGGCAAAAGCCTCTCAATCACATTGAGTGTAAAAAGCTCAAACAAAACAATA^ 

TCACCTCCGATATGCAGCTCAGATAACGCGGATTCCGCTTGGTACAAAGACTTGGAAACT 

TGTATAACACCATTACCAGAAACAAACAATCCAGATC 

GATTGGCCAGACCGAGCATTCGCGGTACCTCCAAGAATCATCAGAGGAACTATACCAGAA 






ATGAACGCGGAGAAATTTAGAGAAGACAACGAGGTTTGGAAAGAGAGAATAGCACATTAC 
AAGAAGATAGTCCCTGAGCTTTCACATGGAAGATTCAGGAACATTATGGACATGAACGCT 
TTTCTCGGCGGATTCGCTGCTTCCATGCTGAAATATCCCTCATGGGTCATGAACGTTGTC 
CCGGTCGATGCAGAGAAACAAACGTTAGGTGTGATCTACGAACGTGGATTGATAGGGACG 
TATCAAGATTGGTGTGAAGGATTCTCAACGTATCCAAGAACTTATGATATGATTCATGCA 
GGAGGATTGTTCAGCTTATACGAACATAGGTGTGATTTGACGTTGATATTGTTGGAGATG 
GATCGAATTTTGAGACCAGAAGGAACAGTTGTGTTGAGAGATAATGTGGAGACGTTGAAT 
AAGGTAGAGAAGATAGTGAAGGGAATGAAGTGGAAGAGTCAAATTGTTGATCATGAGAAA 
GGTCCTTTTAATCCTGAGAAGATTCTTGTTGCTGTTAAAACTTATTGGACTGGTCAACCT 
TCTGACAAGAACAACAACAACAACAACAACAACAACAACTAG 

>G1855 Amino Acid Sequence (domain in AA coordinates: entire protein) 
MAKENSGHHHQTEARRKKLTLILGVSGLCILFYVLGAWQANTVPSSISKLGCETQSNPSS
SSSSSSSSESAELDFKSHNQIELKETNQTIKYFEPCELSLSEYTPCEDRQRGRRFDRNMM 
KYREI^CPVKDELLYCLIPPPPNYKIPFKWPQSRDYAWYDNIPHKELSVEKAVQNWIQVE 

GDRFRFPGGGTMFPRGADAYIDDIARLIPLTDGGIRTAIDTGCGVASFGAYLLKRDIMAV

SFAPRDTHEAQVQFALERGVPAIIGIMGSRRLPYPARAFDLAHCSRCLIPWFKNDGLYLM

EVDRVLRPGGYWILSGPPINWKQYWRGWERTEEDLKKEQDSIEDVAKSLCWKKVTEKGDL 

SIWQKPLNHIECKKLKQNNKSPPICSSDNADSAWYKDLETCITPLPETNNPDDSAGGALE 

DWPDRAFAVPPRIIRGTIPEMNAEKFREDNEVWKERIAHYKKIVPELSHGRFRNIMDMNA

FLGGFAASMLKYPSWVMNVVPVDAEKQTLGVIYERGLIGTYQDWCTGFSTYPRTYDMIHA

GGLFSLYEHRCDLTLILLEMDRILRPEGTVVLRDNVETLNKVEKIVKGMKWKSQIVDHEK

GPFNPEKILVAVKTYWTGQPSDKNNNNNNNNNN*

>G187 (118..1074)

TAGACCTCTTAGGAAAAAAACCTAAAAACCTAATCCCCAAACCTAAAAGGCTTATCTCAT 
CTCTTCTTCTTTGTCTTCTTTACTCTTTTTTTACCT 

TCTAATGAAACCAGAGATCTCTACAACTACCAATACCCTTCATCGTTTTCGTTGCACGAA 
ATGATGAATCTGCCTACTTCAAATCCATCTTCTTATGGAAACCTCCCATCACAAAACGGT 

TTTAATCC^TCTACTTATTCCT^ 

TCTCTACTTCAGAAAACTTTTGGTCTTTCTCCCTCTTCCTCAGAGGTTTTCAATTCTTCG 
ATCGATCAAGAACCGAACCGTGATGTTACTAATGACGTAATCAATGGTGGTGCATGCAAC 
GAGACTGAAACTAGGGTTTCTCCTTCTAATTCTTCCTCTAGTGAGGCTGATCACCCCGGT 
GAAGATTCCGGTAAGAGCCGGAGGAAACGAGAGTTAGTCGGTGAAGAAGATCAAATTTCC 
AAAAAAGTTGGGAAAACGAAAAAGACTGAGGTGAAGAAACAAAGAGAGCCACGAGTCTCG 
TTTATGACTAAAAGTGAAGTTGATCATCTTGAAGATGGTTATAGATGGAGAAAATACGGC 
CAAAAGGCTGTAAAAAATAGCCCTTATCCAAGGAGTTACTATAGATGTACAACACAAAAG 
TGCAACGTGAAGAAACGAGTGGAGAGATCGTTCCAAGATCCAACGGTTGTGATTACAACT 
TACGAGGGTCAACACAACCACCCGATTCCGACTAATCTTCGAGGAAGTTCTGCCGCGGCT 
GCTATGTTCTCCGCAGACCTCATGACTCGAAGAAGCTTTGCACATGATATGTTTAGGACG 
GCAGCTTATACTAACGGCGGTTCTGTGGCGGCGGCTTTGGATTATGGATATGGACAAAGT 
GGTTATGGTAGTGTGAATTCAAACCCTAGTTCTCACCAAGTGTATCATCAAGGGGGTGAG 
TATGAGCTCTTGAGGGAGATTTTTCCTTCAATTTTCTTTAAGCAAGAGCCTTGATCGATC 
ATTGTTATAACTACATATATTATATATATTGAGAGAGAGAGGTAGAGAAAAAAAAA 

>G187 Amino Acid Sequence (domain in AA coordinates: 172-228) 
MSNETRDLYNYQYPSSFSLHEMMNLPTSNPSSYGNLPSQNGFNPSTYSFTDCLQSSPAAY 
ESLLQKTFGLSPSSSEVFNSSIDQEPNRDVTNDVINGGACNETETRVSPSNSSSSEADHP 
GEDSGKSRRKRELVGEEDQISKKVGKTKKTEVKKQREPRVSFMTKSEVDHLEDGYRWRKY
GQKAVKNSPYPRSYYRCTTQKCNVKKRVERSFQDPTVVITTYEGQHNHPIPTNLRGSSAA
AAMFSADLMTPRSFAHDMFRTAAYTNGGSVAAALDYGYGQSGYGSVNSNPSSHQVYHQGG

EYELLREIFPSIFFKQEP*
>G1881 (1..519)

ATGCGAATTTTGTGTGATGCTTGTGAGAGCGCCGCCGCTATCGTCTTTTGCGCCGCCGAC 
GAAGCTGCCCrcrGTTGCTCCTGCGACGAAAAAGTO 

CGGCATCTTCGTGTAGGCTTAGCTGATCCGAGTAATGCACCAAGCTGTGACATATGCGAA 
AATGCACCCGCATTCTTTTACTGTGAGATAGATGGTAGTTCCCTTTGTCTACAATGTGAT 
ATGGTGGTACATGTTGGTGGGAAGAGAACACATAGGCGGTTTCTATTACTGAGACAGAGA 
ATTGAGTTTCCAGGCGATAAGCCTAATCATGCTGACCAACTGGGACTACGGTGTCAAAAG 
GCTTCCTCTGGTCGTGGTCAAGAATCAAATGGGAATGGTGATCATGATCATAATATGATC 
GATCTTAACTCCAATCCTCAAAGAGTACACGAGCCTGGATCACATAACCAAGAGGAGGGT
ATTGATGTAAATAACGCAAACAATCACGAGCATGAATAG 

>G1881 Amino Acid Sequence (domain in AA coordinates: 5-28, 56-79)

MRILCDACESAAAIVFCAADEAALCCSCXIEKVH 

NAPAFFYCEIDGSSLCLQCDMVVHVGGKRTHRRFLLLRQRIEFPGDKPNHADQLGLRCQK

ASSGRGQESNGNGDHDHNMIDLNSNPQRVHEPGSHNQEEGIDVNNANNHEHE* 
>G1882 (1..1200) 

ATGGTTTTTTCTTCATTTCCTACTTATCCTGATCATTC 

CAACCAATCACAACCACCGTTGGATTCACGGGAAATAACATCAACCAACAGTTTCT 

CACCATCCCCTCCCACCGC^CAGCAAC^U^CGCCTCCGCAGCT^^ 

AACGGCGGAGTCGCTGTTCCCGGTGGACCTGGCGGGTTAATCCGACCAGGTTCGATGGCG 

GAAAGAGCAAGGCTAGCCAACATACCATTACCTGAAACAGCCTTGAAGTGTCCAAGATGT 

GACTCAACTAACACCAAATTCTGTTACTTCAAC^ 

TTCTGCAAAGCATGCCGTCGTTACTGGACACGTGGCGGTGCTCTAAGGAGCGTTCCCGTC 
GGTGGCGGTTGCCGTAGAAACAAAAGAACCAAAT^ACAGCAGCGGTGGAGGTGGCGGTAGC 
ACCAGTAGCGGTAACAGCAAGTCACAAGACAGCGCCACGAGGAACGACCAATACCACCAC 
CGAGCCATGGCTAACAATCAGATGGGACCACCTTCTTCGTCATCGTCTCTAAGCTCGTTG 
CTGTCTTCTTACAACGCAGGGTTAATCCCCGGACATGA^ 

ATACTTGGACTTGGATCATCTTTGCCTCCTCTTAAGCTTATGCCTCCTTTAGACTTCACA 
GACAACTTCACCTTACAATACGGTGCCGTTTCAGCTCCTTCTTATCATATAGGCGGTG 
AGCAGTGGAGGAGCGGCGGCTCTTTTAAACGGTTTTGACC^ 
AACCAACTTCCTTTAGGCGGTTTAGACCCGT^ 

AATCCAGGTTACGGATTGGTTACCGGGTCGGGTCAGTATCGACCTAAGAACATTTTCCAT 
AACCTTATCTCCTCTTCTTCGTCTGCTTCATCAGCTATGGTTACAGCCACCGCGTCGCAA 
TTAGCTTCAGTGAAAATGGAAGATAGTAACAATCAGCTCAACTTGTCTAGACAACTTTTT 
GGAGACGAACAACAGCTCTGGAATATTCATGGCGCTGCTGCAGCATCCACCGCAGCTGCA 
ACAAGTTCGTGGAGTGAAGTCTCTAATAATTTCAGTTCTTCTTCTACTAGCAATATATAA 

>G1882 Amino Acid Sequence (domain in AA coordinates: 97-125)
MVFSSFPTYPDHSSNWQQQHQPITTTVGFTGNNINQQFLPHHPLPPQQQQTPPQLHHNNG
NGGVAVPGGPGGLIRPGSMAERARLANIPLPETALKCPRC 

FCKACRRYWTRGGALRSVPVGGGCRRNKRTKNSSGGGGGSTSSGNSKSQDSATSNDQYHH 

RAMANNQMGPPSSSSSLSSLLSSYNAGLIPGHDHNSNNNNILGLGSSLPPLKLMPPLDFT 

DNFTLQYGAVSAPSYHIGGGSSGGAAALLNGFDQWRFPATNQLPLGGLDPFDQQHQMEQQ 

NPGYGLVTGSGQYRPKNIFHNLISSSSSASSAMVTATASQLASVKMEDSNNQLNLSRQLF

GDEQQLWNIHGAAAASTAAATSSWSEVSNNFSSSSTSNI*

>G1883 (1..1110) 

ATGGACGCTACGAAGTGGACACAGGGTTTTCAAGAAATGATGAACGTTAAACCAATGGAG 
CAGATCATGATTCCTAATAACAACACACATCAACCAAACACCACATCCAATGCAAGGCCA
AACACCATTCTCACATCTAACGGCGTCTCAACTGCTGGAGCAACCGTCTCCGGCGTAAGC 
AACAACAATAACAATACGGCGGTTGTGGCGGAGAGGAAAGCAAGACCACAAGAGAAACTA 
AATTGTCCAAGATGCAACTCAACCAACACAAAGTTTTC 

ACACAACGAAGATACTTCTGCAAAGGTTGTCGAAGGTATTGGACCGAAGGTGGATCTCTT 
AGGAATGTTCCTGTGGGAGGAAGCTCAAGAAAGAACAAGAGATCATCTTCATCTTCTTCA 
TGAAACATCCTTCAGACAATACCATCTTCACTTCCAGATCTAAACCCGCCAATACTCTTC 
TOU^CCAAATCCATAATAAATCGAAAGGGTCATC^C^GATCTCAACTTGTTGTCTTTC 
CCAGTCATGCAAGATCAACATCATCATCATGTCCATATGTCTCAGTTTCTTCAGATGCCT 
AAGATGGAGGGAAATGGTAAC^TAACTCATCAGCAGCAGCCTTCATCATCTTCTTCTGTC 
TATGGTTCCTCGTCQTCTCCTGTTTCAGCTCTTGAACTTTTAAGAACCGGAGTTAATGTT 
TCTTCAAGATCAGGGATTAACTCATCGTTCATGCCTTCCGGTTCAATGATGGATTCAAAC 
ACTGTGCTTTACACTTCTTCAGGGTTTCCAACAATGGTGGATTACAAGCCAAGTAATCTC 
TCCTTCTCTACCGATCATCAAGGGCTTGGACACAATAGCAACAATAGGTCTGAAGCTCTT 
CATAGTGATCATCACCAACAAGGTAGAGTTTTGTTTCCATTTGGGGATCAAATGAAGGAG 
CTTTCATCAAGCATAACACAAGAAGTTGATCATGATGATAATCAACAACAGAAGAGTCAT 
GGAAATAATAATAATAATAATAACTCAAGCCCTAATAATGGATATTGGAGTGGGATGTTC 
AGTACTACAGGAGGAGGATCTTCATGGTGA 

>G1883 Amino Acid Sequence (domain in aa coordinates: 82-124) 
MDATKWTQGFQEMMNVKPMEQIMIPNNNTHQPNTTSNARPNTILTSNGVSTAGATVSGVS
NNNNNTAVVAERKARPQEKLNCPRCNSTNTKFCYYNNYSLTQPRYFCKGCRRYWTEGGSL

RNVPVGGSSRKNKRSSSSSSSNILQTIPSSLPDLNPPILFSNQIHNKSKGSSQDLNLLSF 

PVMQDQHHHHVHMSQFLQMPKMEGNGNITHQQQPSSSSSVYGSSSSPVSALELLRTGVNV

SSRSGINSSFMPSGSMMDSNTVLYTSSGFPTMVDYKPSNLSFSTDHQGLGHNSNNRSEAL

HSDHHQQGRVLFPFGDQMKELSSSITQEVDHDDNQQQKSHGNNNNNNNSSPNNGYWSGMF

STTGGGSSW* 

>G1884 (1..741) 

ATGATGACGTCATCCCATCAGAGCAACACCACCGGCTTTAAACCGCGGCGGATCAAGACG
ACGGCGAAGCCACCACGTCAGATCAATAACAAAGAACCATCTCCGGCGACGCAGCCGGTG 
CTCAAGTGTCCGAGATGTGATTCAGTCAACACCAAAT^ 

TTGTCTCAGCCACGTCACTACTGCAAGAACTGTCGTCGTTACTGGACACGTGGCGGCGCC 

CTCCGTAACGTTCCCATCGGTGGCTCCACTCGAAACAAGAACAAGCCTTGCAGCCTCCAA 

GTCATCTCITCTCCTCOTTTGTTCTCGAACGGGACGTCATCGGCGTCTCGTGAGCTTGTA 

AGAAACCATCCATCGACGGCAATGATGATGATGAGTTCTGGTGGATTCTCCGGCTATATG 

TTTCCGTTGGATCCTAACTTCAACCTTGCCTCGTCTTCTATCGAGTCTTTGAGTTCTTTT 

AACG^GATTTGCACCAGAAGCTTCAGCAACAAAGACTCGTCACTTCCATG 

GATTCTCTTCCGGTTAACGAGAAAACGGTTATGTTTCAGAACGTAGAGTTGATTCCTCCT 

TCGACGGTGACGACGGATTGGGTTTTCGATAGGTTCGCCACTGGAGGAGGTGCAACAAGT 

GGCAATCATGAAGATAATGATGATGGGGAGGGTAATTTGGGAAATTGGTTCCATAATGCT 

AATAATAATGCTCTGCTCTAA 

>G1884 Amino Acid Sequence (domain in AA coordinates: 43-71)

MMTSSHQSNTTGFKPRRIKTTAKPPRQINttTKEPSPATQPVLKCPRCDS 

LSQPRHYCKNCRRYWTRGGALRNVPIGGSTRNKNKPCSLQVISSPPLFSNGTSSASRELV

RNHPSTAMMMMSSGGFSGYMFPLDPNFOT^ 

DSLPVNEKTVMFQNVELIPPSTVTTDVTW 

NNNALL* 

>G1891 (1..750) 

ATGGATAACTTGAATGTTTTCGCAAATGAAGACAATCAAGTGAATGATGTGAAGCCCCCA 

CCACCACCACCTCGAGTGTGTGCAAGGTGTGATTCTGATAATACTAAATTTTGTTATTAC 

AACAACTACTGTGAGTTTCAGCCACGATACTTCTGCAAGAACTGTCGT^ 

CATGGTGGGGCTTTAAGAAAC^TACCAATTGGTGGAAGTAGTCGTGCa^CGGGCAAGG 

GTAAATCAACCTTCGGTTGCTCGGATGGTTTCTGTTGAGACCCAACGAGGTAACAATCAA 

CCTTTCTCTAATGTTCAAGAAAACGTTCATCTTGTTGGATCTTTTGGTGCTTCATCTTCA 

TCTTCTGTTGGTGCTGTTGGGAACCTTTTTGGTTCTTTGTATGATATTCATGGTGGTATG 

GTAACAAATTTGC^TCCAACTCGAACTGTTCGACCAAATCATCGCTTAGCTTTCCA 

GGATCATTTGAGCAAGACTATTACGATGTTGGGTCCGATAATCTTTTGGTCAACCAACAA 

GTTGGTGGCTACGGTTATCACATGAATCCAGTGGATCAATTCAAGTGGAACCAGAGCTTC 

AACAACACTATGAACATGAATTATAATAACGATAGCACTAGTGGAAGTAGCAGAGGATCT 

GACATGAATGTGAACCATGATAACAAGAAGATCAGATACCGCAACTCTGTGATTATGCAT 

CCTTGTCATCTGGAGAAGGATGGTCCTTGA 

>G1891 Amino Acid Sequence (domain in aa coordinates: 27-69) 
MDNLNVFANEDNQ^ 

HGGALRNIPIGGSSRAKRARVNQPSVARMVSVETQRGNNQPFSNVQENVHLVGSFGASSS

SSVGAVGNLFGSLYDIHGGMVTNLHPTRTVRPNHRLAFHDGSFEQDYYDVGSDNLLVNQQ

VGGYGYHMNPVDQFKWNQSFNNTMNMNYNNDSTSGSSRGSDMNVNHDNKKIRYRNSVIMH

PCHLEKDGP* 

>G1896 (1..951) 

ATGTCCTCCCATAGCAATCTCCCCTCTCCCAAACCAGTTCCTAAACCAGATCACCGTATC 

TCCGGTACATCCCAAACCAAGAAACCACCGTCTTCCTCCGTAGCTCAAGACCAACAAAAC 

CTAAAATGCCCTCGTTGCTVACTCTCCAAACACAAAGTTCTGTTACTACAACAACTA^ 

CTCTCTCAACCTCGTCACTTCTGCAAATCTTGTCGCCGTTACTGGACACGTGGCGGTGCT 

CTAAGAAACGTCCCCATCGGTGGTGGTTGCCGGAAAACCAAAAAATCTATCAAACCTAAT 

TCCTCC^TGAACACACTTCCTTCGTCTTC^ 

GAAGATTCATCCAAATTCTTCCCTCCTCCGACAACAATGGATTTTCAGCTGGCCGGATTA 
TCTCTCAACAAAATGAACGATCTTCAACTTTTGAATAACCAAGAAGTTCTTGATCTTAGG 
CCCATGATGTCCTCGGGCCGAGAAAACACACCCGTTGATGTCGGGTCGGGTTTATCCCTA 
ATGGGTTTTGGAGATTTCAACAACAACCATTCACCGACGGGGTTCACAACCGCCGGAGCA 
AGCGACGGAAACTTAGCTTCTTCTATAGAGACITTGAGTTGTT^

TGGAGGCTTCAGCAACAGAGGATGGCGATGCTTTTTGGTAATTCTAAGGAAGAAACTGTT 

GTCGTCGAGAGGCC^CAACCTATTCTTTATCGGAATCTTGAGATCGTAAACTCATCATCG 

CCGTCGTCGCCGACGAAGAAAGGAGATAATCAGACAGAGTGGTATTTTGGTAATAACAGT 

GATAATGAAGGAGTGATTAGTAATAATGCTAATACAGGAGGAGGAGGAAGTGAATGGAAC 

AATGGAATTCAAGCTTGGACTGATCTTAATCATTATAATGCATTGCCTTGA 

>G1896 Amino Acid Sequence (domain in aa coordinates: 43-85) 

MSSHTNLPSPKPVPKPDHRISGTSQTKKPPSSSVAQD^ 

LSQPRHFCKSCRRYWTRGGALRNVPIGGGCRKTKKSIKPNSSMNTLPSSSSSQRFFSSIM
EDSSKFFPPPTTMDFQLAGLSLNKMNDLQLLNNQEVLDLRPMMSSGRENTPVDVGSGLSL

MGFGDFNNNHSPTGFTTAGASDGNLASSIETLSCLNQDLHWRLQQQRMAMLFGNSKEETV
VVERPQPILYRNLEIVNSSSPSSPTKKGDNQTEWYFGNNSDNEGVISNNANTGGGGSEWN
NGIQAWTDLNHYNALP* 
>G1898 (1..630) 

ATGCCGTCGGAACCAAACCAAACCCGACCCACCAGAGTTCAGCCCTCAACGGCGGCTTAC 

CCACCX3CCAAATCnX3GCTGAGCCTCTTC 

TTCTGTTACTACAACAACTATAACCTCGCTCAGCCT 

CGTTACTGGACTCAAGGTGGTACACTCCGTGACGTCCCCGTCGGTGGTGGAACTCGTCGA 
AGCTCCTCAAAACGTC^CCGTTCTTTC^ 

TCCGTCATCACCACCACGACACAAGAACCAGCCACGACTGAAGCGAGTCAAACTAAGGTT 
ACTAATTTAATTTCAGGTCATGGAAGCTTTGCTTCTCTGTTAGGTTTAGGAAGTGGAAAT 
GGTGGGTTGGATTACGGGTTTGGGTACGGGTACGGGCTTGAGGAGATGAGTATTGGGTAT 
CTTGGAGATTCTTCCGTAGGAGAGATTCCGGTGGTTGATGGTTGTGGTGGTGACACGTGG 
CAGATTGGGGAGATTGAAGGTAAAAGTGGAGGAGACAGTTTGATATGGCCTGGTCTTGAG 
ATCTCAATGCAAACCAACGATGTTAAGTGA 

>G1898 Amino Acid Sequence (domain in AA coordinates: 31-59) 
MPSEPNQTRPTRVQPSTAAYPPPNLAEPLPCPRCNSTTTK^ 

RYWTQGGTLRDVPVGGGTRRSSSKRHRSFSTTATSSSSSSSVITTTTQEPATTEASQTKV 
TNLISGHGSFASLLGLGSGNGGLDYGFGYGYGLEEMSIGYLGDSSVGEIPVVDGCGGDTW
QIGEIEGKSGGDSLIWPGLEISMQTNDVK* 
>G1902 (1..615) 

ATGCAGGATCCAGCAGCATATTACCAGACGATGATGGCGAAGCAACAACT^ACAACAACAA 
CCACAGTTTGCAGAGCAAGAACAGTTAAAGTGTCCTCGTTGTGACTCACCAAACACTAAA 
TTCTGTTACTACAACAACTACAATC 

CGTTACTGGACTAAAGGCGGCGCTCTCCGTAACGTTCCCGTCGGTGGTGGTTCTCGTAAG 
AACGCAACCAAACGATCCACTTCTTCTT 

CAAAACAAGAAGACGAT^AAACCCGGATCCGGATCCTGATCCACGTAATTCTCAAAAACCG 
GATTTGGATCCGACCCGGATGCTTTACGGGTTTCCGATCGGTGACCAAGACGTGAAGGGT 
ATGGAGATTGGTGGAAGCTTTAGCTCGTTGTTGGCGAATAATATGCAGCTTGGTCTTGGA 
GGAGGAGGGATCATGCTTGACGGGTCGGGTTGGGATCATCCGGGTATGGGTTTGGGTTTG 
AGGAGAACCGAACCGGGTAATAATAATAATAACCCATGGACCGATCTGGCTATGAACAGA 

GCGGAGAAAAACTGA 

>G1902 Amino Acid Sequence (domain in AA coordinates: 31-59)
MQDPAAYYQTMMAKQQQQQQPQFAEQEQLKCPRCDSPNTKFCYYNNYNLSQPRHFCKSCR
RYWTKGGALRNVPVGGGSRKNATKRSTSSSSSASSPSNSSQNKKTKNPDPDPDPRNSQKP
DLDPTRMLYGFPIGDQDVKGMEIGGSFSSLLANNMQLGLGGGGIMLDGSGWDHPGMGLGL

RRTEPGNNNNNPWTDLAMNRAEKN*
>G1904 (1..924)

ATGCAAGATATTCATGATTTCTCCATGAACGGAGTTGGTGGTGGGGGAGGAGGAGGAGGG 
AGGTTTTTCGGTGGAGGAATCGGCGGCGGAGGAGGTGGTGATCGAAGGATGAGAGCTCAT 
CAGAACAATATACTTAACCATCATCAATCTCTCAAGTGTCCTCGTTGTAATTCTCTTAAC 
ACAAAGTTCTGTTACTACAACAATTACAATCTTTCT 

TGTCGTCGTTACTGGACTAAAGGTGGTGTTCTCCGTAACGTTCCCGTCGGAGGTGGTTGC 
CGGAAAGCTAAACGTTCGAAAACAAAACAGGTTCCGTCGTCGTCATCAGCCGACAAACCA 
ACX3ACGACGCAAGATGATCATCACGTGGAGGAGAAATCGAGTACAGGATCTCACTCTAGC 
AGCGAGAGCTCTTCTCTCACCGCTTCTAACTCTACCACCGTCGCCGCCGTCTCCGTCACC 
GCGGCGGCGGAAGTTGCTTCGTCGGTTATTCCAGGTTTTGATATGCCTAATATGAAAATT 
TACGGTAACGGGATCGAGTGGTCGACGTTACTTGGACAAGGCTCATCGGCCGGTGGTGTT
TTCTCGGAGATCGGTGGTTTTCCGGCGGTTTCAGCTATTGAAACTACACCGTTTGGATTC 
GGGGGTAAATTCGTAAATCAAGATGATCATCTGAAGTTAGAAGGTGAAACTGTACAGCAG 
CAACAGTTTGGAGATCGAACGGCTCAGGTTGAGTTTCAAGGAAGATCTTCGGATCCGAAT 
ATGGGATTTGAACCGTTGGATTGGGGAAGTGGCGGTGGAGATCAAACACTGTTTGATTTA 
ACCAGTACCGTTGATCATGCATACTGGAGTCAAAGTCAATGGACGTCGTCTGACCAAGAT 
CAGAGTGGTCTCTACCTTCCTTGA 

>G1904 Amino Acid Sequence (domain in aa coordinates: 53-95) 
MQDIHDFSMNGVGGGGGGGGRFFGGGIGGGGGGDRRMRAHQNNILNHHQSLKCPRCNSLN

TKFCYYNNYNLSQPRHFCKNCRRYWTKGGVLRNVPVGGGCRKAKRSKTKQVPSSSSADKP

TTTQDDHHVEEKSSTGSHSSSESSSLTASNSTTVAAVSVTAAAEVASSVIPGFDMPNMKI 
YGNGIEWSTLLGQGSSAGGVFSEIGGFPAVSAIETTPFGFGGKFVNQDDHLKLEGETVQQ 
QQFGDRTAQVEFQGRSSDPNMGFEPLDWGSGGGDQTLFDLTSTVDHAYWSQSQWTSSDQD

QSGLYLP* 
>G1906 (1..795) 

ATGGTGGAACGTGCTCGGATCGCAAAAGTCCCATTGCCTGAAGCAGCTCTAAATTGCCCT 

AGATGTGACTCAACCAATACTAAGTTCTGTTACTTCAATAACTATAGCCTTACTCAACCT 

CGCCATTTCTGCAAAACATGTCGTCGCTATTGGACACGTGGCGGTTCCTTGAGGAATGTT 

CCTGTTGGAGGAGGCXTTAGGAGGAACAAGAGAAGCAAATCCAGATCGAAATCTACGGTC 

GTGGTCTCGACTGATAATACTACrAGTACTTCATCACTTACTTCTCGCCCA^ 

AACCCTAGCAAGTTTCATAGCTACGGTCAAATCCCGGAGTTTAATTCCAACTTGCCCATC 

TTGCCTCCTCTCCAAAGCCTTGGAGATTACAAT^ 

GGAACTCAAATAAGCAACATGATAAGTGGTATGAGTTCTAGTGGTGGGATCTTGGATGCA 
TGGAGAATACCTCCATCACAACAAGCTCAGCAATTCCCTTTCTTGATCAACACTACCGGA 
TTGGTGCAATCTTCAAACGCGTTATATCCAT^ 

ACAAGAAATGTGAAGGCGGAAGAGAATGATCAGGATCGGGGTAGGGATGGGGATGGAGTG 
AATAACTTATCAAGAAACTTTTTGGGTAATATCAACATAAACTCAGGCAGGT^ACGAGGAA 
TACACATCATGGGGAGGTAACAGTTCTTGGACCGGTTTCACCTCCAACAACTCAACAGGC 

CATCTCTCATTCTAA 

>G1906 Amino Acid Sequence (domain in AA coordinates: 19-47)

MVERARIAKVPLPEAALNCPRCDSTNTKFCYFNNYSLTQPRHFCKTCRRYWTRGGSLRNV

PVGGGFRRNKRSKSRSKSTVVVSTDNTTSTSSLTSRPSYSNPSKFHSYGQIPEFNSNLPI

LPPLQSLGDYNSSNTGLDFGGTQISNMISGMSSSGGILDAWRIPPSQQAQQFPFLINTTG

LVQSSNALYPLLEGGVSATQTRNVKAEENDQ 

YTSWGGNSSWTGFTSNNSTGHLSF*

>G1913 (1..744) 

ATGGAGAGAGCAGAGGCCTTGACATCATCGTTTATATGGCGGCCAAACGCAAACGCAAAC 
GCGGAGATCACGCCGAGTTGTCCAAGATGTGGATCCTCTAACACAAAGTTCTGTTACTAC 
AACAACTATAGCCTCACTCAGCCTCGCTACTTCTGCAAAGGCTGCCGCAGATATTGGACC 
AAAGGTGGTTCCCTCCGCAATGTTCCTGTAGGCGGTGGCTGTCGAAAATCCCGCCGCCCC 
AAATCATCTTCTGGTAACAATACTAAAACTAGCCTAACCGCTAATTCTGGCAACCCCGGT 
GGTGGTTCACCAAGCATCGATCTTGCTCTTGTTTACGCCAATTTCTTGAATCCAAAGCCT 
GACGAATCTATACTACAAGAAAATTGCGACTTAGCCACTACGGATTTTTTGGTAGATAAT 
CCTACCGGCACTTCCATGGACCCTTCATGGAGTATGGACATCAATGATGGTCATCATGAT 
CAXTATATTAATCCGGTGGAACACATTGTGGAGGAATGTGGTTATAATGGCTTGCCTCCA 
TTTCCTGGTGAAGAGCTTCTCTCTTTAGACACTAATGGTGTTTGGTCTGATGCTTTGTTG 
ATTGGTCATAACCATGTAGACGTTGGCGTGACTCCGGTTCAGGCTGTACACGAACCGGTG 
GTTCATTTCGCTGAeGAATCCAATGATTCCACCAATCTCTTGTTTGGAAGTTGGAGCCCT 
TTTGATTTCACTGCCGATGGATGA 

>G1913 Amino Acid Sequence (domain in AA coordinates: 27-55) 

MERAEALTSSFIWRPNANANAEITPSCPRCGSSNTKFCYYNNYSLTQPRYFCKGCRRYWT 

KGGSLRNVPVGGGCRKSRRPKSSSGNNTKTSLTANSGNPGGGSPSIDLALVYANFLNPKP

DESILQENCTIATTDFLVDNPTGTSMDPSWSMDINDGHHDHYINPVEHIVEECGYNGLPP 

FPGEELLSLDTNGVWSDALLIGHNHVDVGVTPVQAVHEPVVHFADESNDSTNLLFGSWSP

FDFTADG* 

>G1914 (1..945) 

ATGGAGAGATACAAGTGTAGATTTTGCTTCAAGAGCTTCATCAATGGAAGAGCTTTAGGT 
GGTCACATGAGATCTCACATGCTTACTCTTTCTGCAGAACGTTGTGTAATAACTGGTGAA

GCAGAAGAAGAAGTAGAGGAACGGCCGAGTCAACTCTGTGACGACGACGACGATACCGAG 

TCCGATGCTTCTTCTTCTTCTGGTGAGTTTGATAATCAAAAGATGAATCGTCTTGATGAT 

GAATTGGAGTTTGATTTCGCTGAAGACGACGACGTTGAAAGTGAAACCGAGTCGTCCAGG 

ATTAACCCT^CTCGGCGACGATCTAAACGMCTCGGAAACTTGGATCGTTTGATTTCGAC 

TTTGAGAAGCTAACAACGAGCCAACCCAGTGAGTTAGTGGCCGAGCCAGAGCATCACAGC

TC^GCTTCTGATACAACT^CGGAGGAAGATCTCGCCTTTTGTCTCATTATGCTGT^ 

GACAAATGGAAGCAACAGAAGAAGAAGAAGCAACGTGTAGAAGAAGATGAGACAGATCAT 

GACAGTGAAGATTACAAATCAAGCAAGAGCAGAGGGAGATTCAAGTGTGAGACTTGTGGT 

AAAGTGTTTAAATCGTATCAAGCATTAGGAGGACACAGAGCAAGCCACAAGAAGAACAAG 

GCATGCATGACGAAAACAGAGCAAGTTGAAA(^GAGTACGTTCTTGGAGTAAAGGAGAAG 

AAAGTTCATGAATGTCCGATCTGTTTTAGGGTTTTTACTTCAGGGCAAGCACTTGGAGGT 

CATAAGAGATCTCACGGAAGTAACATCGGAGCAGGAAGAGGATTGTCAGTAAGTCAAATT 

GTCCAAATCGAAGAAGAAGTATCAGTGAAACAGAGGATGATTGATCTTAATCTTCCTGCA 

CCTAATGAAGAAGATGAAACTTCTTTGGTGTTTGATGAATGGTGA 

>G1914 Amino Acid Sequence (domain in AA coordinates: 195-216, 245-266)
MERYKCRFCFKSFINGRALGGHMRSHMLTLSAERCVITGEAEEEVEERPSQLCDDDDDTE
SDASSSSGEFDNQKMNRLDDELEFDFAEDDDVESETESSRINPTRRRSKRTRKLGSFDFD
FEKLTTSQPSELVAEPEHHSSASDTTTEEDLAFCLIMLSRDKWKQQKKKKQRVEEDETDH
DSEDYKSSKSRGRFKCETCGKVFKSYQALGGHRASHKKNKACMT

KVHECPICFRVFTSGQALGGHKRSHGSNIGAGRGLSVSQIVQIEEEVSVKQRMIDLNLPA

PNEEDETSLVFDEW* 
>G1925 (1..945) 

ATGGAAGAAAATCTTCCTCCGGGGTTCAGATTTCATCCTACAGACGAGGAGCTCATAACG 
CATTATCTATGTCGGAAAGTCTCCGATATAGGATTCACCGGTAAAGCTGTCGTCGACGTT 
GATCTCAAC^AGTGTGAACCTTGGGATTTGCCA^^ 

TGGTATTTCTTCAGCCAAAGGGATCGGAAATATCCAACCGGTTTAAGAACAAACCGGGCA 

ACAGAAGCTGGTTACTGGAAAACCACCGGGAAAGATAAAGAAATATACCGAAGTGGAGTG 

TTGGTTGGGATGAAGAAAACCCTAGTTTTCTACAAAGGAAGAGCTCCCAAAGGTGAGAAA 

AGCAATTGGGTTATGC^TGAGTACAGGCTTGAGAGCAAACAACCTTTC^^C 

AAGGAGGAATGGGTAGTGTGTAGGGTTTTCGAAAAGAGCACGGCAGCAAAGAAAGCACAA

GAACAACAACCTCAATCTTCTCAACCATCTTTTGGATCTCCATGCGATGCAAACTCATCA 

ATGGCAAATGAGTTTGAAGATATTGATGAGCTTCCGAATCTGAATTCAAACTCATCAACC 

ATCGATTACAATTU^TC^TATCCATCAATATTCGCAACGCAATGTTTAC 

ACAACAAGTACGGCTGGTCTCAACATGAACATGAACATGGCTAGTACTAATCnTCAGTC 

TGGACAACAAGTCTCCTTGGTCCGCCTT^ 

TTC(^AATCAGGAACTCTTATAGTTTCCCAAAAGAGATGATCCCCAGTTTCAATCATTCT 
TCTCTTCAACAAGGAGTCTCCAATATGATCCAAAATGCTTCAAGTTCGTCTCAAGTGCAA 
CCCCAACCGCAAGAGGAAGCGTTTAATATGGACTCCATATGGTGA 

>G1925 Amino Acid Sequence (conserved domain in AA coordinates: 6-150)

^ENLPPGFRFHPTDEELITHYIiCRKVSDIGFTGKA 

WYFFSQRDRKYPTGLRTNRATEAGYWKTTGKDKEIYRSGV^ 

SNWWIHEYRLESKQPFNPTNKEEWWCRWEKSTAAKKAQEQQPQSSQPSFGSPCDANSS 
MANEFEDIDELPNLNSNSSTIDYNimiHQYSQRl!^ 

WTTSLLGPPLSPINSLLLKAFQIRNSYSFPKEMIPSFNHSSLQQGVSNMIQNASSSSQVQ 

PQPQEEAFNMDSIW* 
>G1929 (1..366) 

ATGTGTAGAGGCTT8AATAATGAAGAGAGCAGAAGAAGTGACGGAGGAGGTTGCCGGAGT 
CTCTGCACGAGACCGAGTGTTCCGGTAAGGTGTGAGCTTTGCGACGGAGACGCCTCCGTG 
TTCTGTGAAGCGGACTCGGCGTTCCTCTGTAGAAAATGTGACCGGTGGGTTCATGGAGCG 
AATTTTCTAGCTTGGAGACACGTAAGGCGCGTGCTATGCACTTCTTGTCAGAAACTCACG 
CGCCGGTGCCTCGTCGGAGATCATGACTTCCACGTTGTTTTACCGTCGGTGACGACGGTC 

CTCTGA 

>G1929 Amino Acid Sequence (domain in AA coordinates: 31-53)
MCRGLNNEESRRSDGGGCRSLCTRPSVPVRCELCDGDASVFCEADSAFLCRKCDRWVHGA 
NFLAWRHVRRVLCTSCQKLTRRCLVGDHDFHVVLPSVTPTVGETTVENRSEQDNHEVPFVF
L*

>G1930 (76..1077)

ATTCACATTACTAATCTCTCAAGATTTCACAATTTTCTTGTGATTTTCTCT 

ATTTCGTTTCATAACATGGATGCCATGAGTAGCGTAGACGAGAGCTCTACAACTACAGAT 

TCCATTCCGGCGAGAAAGTCATCGTCTCCGGCGAGTTTACTATATAGAATGGGAAGCGGA 

ACAAGCGTGGTACTTGATTCAGAGAACGGTGTCGAAGTCGAAGTCGAAGCCGAATCTUVGA 

AAGCTTCCTTCTTCAAGATTCAAAGGTGTTGTTCCTCAACCAAATGGAAGATGGGGAGCT 

CAGATTTACGAGAAACATCAACGCGTGTGGCTTGGTACTTTCAACGAGGAAGACGAAGC^ 

GCTCGTGCTTACGACGTCGCGGCTCACCGTTTCCGTGGCCGCGATGCCGTTACTAATTTC 

AAAGACACGACGTTCGAAGAAGAGGTTGAGTTCTTAAACGCGCATTCGAAATCAGAGATC 

GTAGATATGTTGAGAAAACACACTTACAAAGAAGAGTTAGACCAAAGGAAACGTAACCGT 

GACGGTAACGGAAAAGAGACGACGGCGTTTGCTTTGGCTTCGATGGTGGTTATGACGGGG 

TTTAAAACGGCGGAGTTACTGTTTGAGAAAACGGTAACGCCAAGTGACGTCGGGAAACTA 

AACCGTTTAGTTATACCAAAACACGAAGCGGAGAAACATTTTCCGTTACCGTTAGGT 

AATAACGTCTCCGTTAAAGGTATGCTGTTGAATTTCGAAGACGTTAACGGGAAAGTGTGG 

AGGTTCCGTTACTCTTATTGGAATAGTAGTCAAAGTTATGTGTTGACCAAAGGTTGGAGT

AGATTCGTTAAAGAGAAGAGACTTTGTGCTGGTGATTTGATCAGTTTTAAT^AGATCCAAC 

GATCAAGATCAAAAATTCTTTATCGGGTGGAAATCGAAATCCGGGTTGGATCTAGAGACG 

GGTCGGGTTATGAGATTGTTTGGGGTTGATATTTCTTTAAACGCCGTTC 

GAAACAACGGAGGTGTTAATGTCGTCGTTAAGGTGTAAGAAGCAACGAGTTTTGTAATAA 

CAATTTAACAACTTGGGAAAGAAAAAAAAGCTTTTTC 
ATCTTGCTGAGATTA 

>G1930 Amino Acid Sequence (domain in AA coordinates: 59-124) 
^AMSSVDESSTTTDSIPARKSSSPASLLYRMGSGTSVVLDSENGVEVEVEAESRKLPSS 

RFKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEDEAARATO^ 

EEEVEFLNAHSKSEIVDMLRKHTYKEE^ 

LLFEKTVTPSDVGKLNRLVIPKHQAEKHFPLPLGNNNVSVKGMLLl^EDW 

YWNSSQSYVLTKGWSRFVKEKRLCAGDLISFKRSNDQDQKFFIGWKSKSGLDLETGRVMR

LFGVDISLNAVVVVKETTEVLMSSLRCKKQRVL* 

>G195 (51..1031)

TTTTCTTTTTCTTTCTTTTTGGTTTAAGTTTTTTCTCTTTGTTCTTCGTCATGTCTCATG 

AAATCAAAGATCTTAACAACTATCACTACACTTCATCGTATAATC^TTACAATATCAA^ 

ACCAAAATATGATTAATCTCCCTTACGTTTCTGGTCCATCTGCTTATAATGCAAACATGA 

TCTCAT(^T(^C^GTAGGTTTTGATCTACCCTCGAAGAACTTGAGTCCTCAAGGAGCCT 

TCGAGTTGGGTTTCGAGCTTTCTCCATCTTCTTCTGACTTTTTTAATCCTTCCCTCGATC 

AAGAGAACGGTTTGTATAATGCTTATAATTATAATAGTAGTCAAAAGAGTCATGAAGTTG 

TCGGTGATGGTTGTGCAACCATTAAGAGTGAAGTTAGGGTTTCAGCATCTCCTTCTTCAA 

GTGAGGCCGATCATCATCCAGGAGAAGATTCCGGCAAGATCCGGAAGAAAAGAGAAGTTC 

GCGATGGAGGAGAAGATGATCAACGCTCTCAGAAAGTAGTTAAAAGAAAGAAGAAAGAGG 

AGAAGAAAAAAGAGCCACGAGTCTCGTTCATGACTAAGACCGAAGTTGATCATCTCGAAG 

ACGGCTATCGTTGGAGAAAGTATGGCCAAAAAGCAGTCAAAAACAGTCCTTATCCGAGGA 

GTTACTATAGATGCACGACTCAGAAGTGCAACGTGAAGAAGAGAGTGGAGAGATCTTACC 

AAGACCCAACGGTCGTCATCACAACCTACGAGAGTCAACACAACCATCCGATCCCGACCA 

ATCGTCGGACAGCAATGTTCTCTGGAACCACCGCATCTGATTATAACCCATCATCGTCTC 

C7VATATTCTCCGATCTCATCATGAATACTCCAAGAAGCTTCTCAAATGATGATCTCTTCC 

GTGTGCCATACGCTAGTGTGAACGTGAACCCTAGTTATCATCAACAGCAACATGGATTTC 

ATCAACAGGAGAGTGAGTTCGAGCTCTTGAAGGAGATGTTTCCTTCGGTTTTCTTCAAAC 

AAGAGCCTTGATGATATAATATAATATAGAAACAATTTTTTTTCTGCTAAGAAATATAGA 

ACAAAACTTGGATGCATAATAAGTGATGATAGTGTTATTTATTTTTTGCATGTATATATT 

ATACATGTTTTGTTAACTAGCTATAGGATATACTGGTAGTAATTAAGCATAAATATGGAG 

CCCTTCGACTTATTACAATAATTTTTGGTATGGA 

NNNTTNNGG 

>G195 Amino Acid Sequence (domain in AA coordinates: 183-239) 
MSHEIKDLNNYHYTSSYNHYNINNQNMINLPYVSGPS 

QGAFELGFELSPSSSDFFNPSIiDQENGLYNAYNYNSSQKSHEWGDGCATlKSEVRVSAS 
PSSSEADHHPGEDSGKIRKKREVRDGGEDDQRSQKVVKTKKKEEKKKEPRVSFMTKTEVD 
HLEDGYRWRK^GQKAVKNSPYPRSYYRCTTQKCWKKRVERSYQDPTWITTYESQHNHP 
IPTNRRTAMFSGTTASDYNPSSSPIFSDLIINTPRSFSNDDLFRVPYASWVNPSYHQQQ

HGFHQQESEFELLKEMFPSVFFKQEP* 

>G1954 (196..1440)

ATTTATGACTTCTCAATACAAAAAGCTCCCCTCACTTTTTO 
CCGTCTTCTTCTACTATCTTGCATCTCTTGC^ 

AGCAAATCATACAAGGTCAAGAAGCTTGACCTTCATTAGACTTAAGCAGTTTATAATCAA 

CTACCACGAATAGCAATGGATAAAGATTACTCGGCACCAAACTTCTTAGGTGAATCCTCA 

GGCGGTAACGATGATAACAGCTCTGGTATGATAGACTATATGTTCAATAGAAACCTTCAA 

CAACAACAAAAGCAATCGATGCCACAACAGCAGCAACATCAACTCTCTCCTTCCGGAT^ 

GGAGCAACACCCTTTGATAAAATGAACTTCTCTGATGTGATGCAGTTTGCGGACTTCGGT 

TCGAAACTTGCGTTGAACCAGACCAGAAACCAAGACGATCAAGAAACCGGGATTGACCCC 

GTTTATTTCTTGAAGTTCCCTGTCTTGAACGACAAAATAGAGGACCATAACCAAACCCAA 

CATCTCATGCCTTCTC^TCAGACGTCTCAAGAAGGAGGTGAGTGTGGAGGAAACATAGGC 

AATGTGTTTCTTGAAGAAAAAGAAGATCAAGACGATGACAACGACAACAACTCCGTGCAA 

CTACGTTTTATTGGAGGAGAAGAAGAAGATAGGGAGAACAAGAATGTTACGAAAAAGGAG 

GTGAAGAGCAAGAGGAAGAGAGCTAGAACGAGCAAGACCAGCGAAGAAGTGGAAAGCCAA 

CGGATGACTCATATCGCGGTCGAAAGAAACCGTAGGAAGCAAATGAATGAGCATCTTCGT 

GTCCTTAGATCTCTCATGCCTGGCTCCTACGTTCAAAGGGGAGACCAAGCGTCAATCATA 

GGAGGAGCAATAGAGTTTGTGAGAGAGCTCGAGCAACTCCTACAATGTCTTGAATCACAG 

AAGCGTCGAAGAATCTTAGGAGAAACCGGTAGGGACATGACAACGACAACGACTTCTTCT 

TCTTCTCCCATAACTACGGTAGCGAACCAAGCACAACCGCTCATTATTACGGGAAATGTA 

ACCGAGCTAGAGGGCGGAGGAGGGCTTCGGGAGGAGACTGCGGAGAACAAGTCGTGCTTG 

GCTGACGTGGAGGTGAAGCTGCTAGGGTTTGACGCCATGATCAAGATACTTTCAAGAAGA 

AGGCCGGGACAGCTGATTAAGACTATAGCTGCTTTGGAGGATCTTCATCTCTCTATTCTT 

CACACTAACATCACTACCATGGAACAAACCGTCCTCTACTCCTTTAATGTCAAGATAACA 

AGTGAAACGAGGTTTACGGCAGAAGACATAGCAAGTTCCATCCAACAGATATTTAGTTTC 

ATTCATGCAAATACCAACATATCTGGAAGCTCTAACCTGGGAAATATTGTGTTTACTTGA 

AAATCATCACACGGCGACAACTTTGTACACTGGTGAAGATTACAGTACGTAATAATCTCT 

ACATATTGGGTTTTATTCTCCAAGCATTTGGAAGAGTGTTTAAGTTAAAGGGAGTGCTTA 

CTTTATTTTTTTGGGGCTTTTTTCATGCAATTTAAATTTTAGTGATGATTGTGTCGCTTG 

TAATGTTAGAACTCGTTGTTGTGATTTCTGCTGCTTTGATTTGTAGGTTTTGAACAAGCG 

GTTTAGAATGCTAAACCACTTATTTACTTGAAATAACTTTTTTCACAAAAAAAAAAAAAA 

AAGAAAAAAA 

>G1954 Amino Acid Sequence (domain in AA coordinates: 187-259)
^KDYSAPNFLGESSGGITODNSSGMIDYMFNRNLQQQQKQSMPQQQQHQLSPSGFGATPF 

DKMNFSDVMQFADFGSKLALNQTRNQ^ 

HQTSQEGGECGGNIGNVFLEEKEDQDDDNDNNSVQLRFIGGEEEDRENKNVTKKEVKSKR
KRARTSKTSEEVESQRMTHIAVERNRRKQMNEHLRVLRSLMPGSYVQRGDQASIIGGAIE 
FVRELEQLLQCLESQKRRRILGETGRDMTTTTTSSSSPITTVANQAQPLIITGNVTELEG 
GGGLREETAENKSCLADVEVKLLGFDAMIKILSRRRPGQLIKTIAALEDLHLSILHTNIT
TMEQTVLYSFNVKITSETRFTAEDIASSIQQIFSFIHANTNISGSSNLGNIVFT* 

>G1958 (107..1336)

GTACCGTCGACCGATTATCCCCAAGAGGAGAATCCTCATAATCATTTTCTCCGATTCGAT 
TCGTCTTCCTTGGTCCTGGATTGCTTCATGAATTTCTAGGACAACAATGGAGGCTCGTCC 
AGTTCATAGATCAGGTTCGAGAGACCTCACACGCACTTCTTCAATCCCATCTACACAAAA 
ACCTTCACCAGTAGAAGATAGTTTCATGAGATCAGATAACAACAGTCAGTTAATGTCTAG 
ACCATTAGGACAAACCTACCATTTACTTTCATCTAGTAACGGTGGAGCTGTTGGACATAT 
ATGTTCTTCTTCAT6ATCTGGTTTTGCAACCAATCTCCATTACTCAACTATGGTATCTCA 
TGAGAAACAACAACACTACACAGGAAGCAGCAGTAATAATGCTGTGCAGACACCAAGCAA 
CAACGATAGTGCTTGGTGTCATGATTCATTGCCAGGAGGGTTTCTTGACTTCCATGAAAC 
CAACCCGGCGATTCAAAACAACTGTCAGATTGAGGATGGTGGCATTGCGGCTGCTTTTGA 
TGACATTCAAAAACGAAGTGATTGGCATGAATGGGCTGACCATTTGATCACTGATGATGA 
TCCTTTGATGTCTACTAACTGGAATGATCTCTTGCTTGAAACAAATTCCAATTCAGATTC 
AAAGGACCAGAAGACACTGCAAATTCCGCAACCTCAGATTGTTCAGCAGCAACCTTCTCC 
GTCTGTGGAATTGCGACCTGTTAGCACAACATCTTC^U^ACAGCAATAACGGAACGGGCAA 
GGCACGAATGCGTTGGACGCCAGAGCTTCACGAGGCTTTTGTTGAGGCTGTCAACAGTCT 
TGGCGGTAGTGAAAGAGCTACTCCTAAAGGGGTACTGAAGATTATGAAAGTTGAAGGCTT 
GACTATATATCATGTTAAAAGCCATTTACAGAAATATAGGACAGCTAGATATCGGCCAGA
ACCATCAGAAACTGGTTCGCCAGAAAGGAAGTTGACACCGCTTGAACATATAACATCTCT 
TGATTTGAAAGGTGGGATAGGTATTACAGAGGCTCTACGACTTCAGATGGAAGTACAGAA 
GCAACTCCATGAGCAGCTCGAGATTCAAAGAAACCTGCAACTCCGAATAGAAGAACAAGG 
CT^GTACCTGCAAATGATGTTCGAGAAGCAAAACTCTGGTCTTACCAAAGGGACAGCCTC 
AACATCAGATTCCGCAGCCAAATCTGAACAAGAAGACAAGAAGACTGCTGATTCGAAGGA 
GGTTCCAGAAGAAGAAACCAGGAAATGTGAGGAACTAGAATCTCCACAGCCAAAGCGTCC 
CAAAATCGATAATTGAAAGTATTGGTCTTITGCTGGATAATCTCGGAGTTTCAGAGTTAA 
CAGTGATAGAGAGAACGAGCTCTTATCTTGAGGTTCTTCAGGACTTCTCTCGCGGCCGCT 
CTAG 

>G1958 Amino Acid Sequence (domain in AA coordinates: 230-278) 
MEARPVHRSGSRDLTRTSSIPSTQKPSPVEDSFMRSDNNSQLMSRPLGQTYHLLSSSNGG 
AVGHICSSSSSGFATNLHYSTMVSHEKQQHYTGSSSNNAVQTPSNNDSAWCHDSLPGGFL
DFHETNPAIQNNCQIEDGGIAAAFDDIQKRSDWHEWADHLITDDDPLMSTNWNDLLLETN

SNSDSKDQKTLQIPQPQIVQQQPSPSVELRPVSTTSSNSNNGTGKARMRWTPELHEAFVE
AVNSLGGSERATPKGVLKIMKVEGLTIYHVKSHLQKYRTARYRPEPSETGSPERKLTPLE
HITSLDLKGGIGITEALRLQMEVQKQLHEQL^
KGTASTSDSAAKSEQEDKKTADSKEVPEEETRKCEELESPQPKRPKIDN*

>G196 (111..1421)

TCGACATCAGATTTCTCTCACGGATTCCTAATCATT 

ATTCTTCCCGTGTATAAATCTCATATAAACACGCATCATACATATATATTATGTGCAGCG 
TCTTTGAGTTTCAAGACATGGACAACTTCCAAGGAGATCTAACAGACGTCGTACGAGGA^ 
TAGGATCAGGCCACGTGTCACCATCTCCTGGACCACCGGAAGGTCCATCTCCGAGCAGCA 
TGTCTCCGCCGCCAACATCAGATCTCCACGTGGAATTCCCCTCCGCCGCTACTTCTGCCA 
GCTGTCTCGCAAATCCCTTCGGAGACCCGTTCGTAAGCATGAAGGATCCTCTCATCCACC 
TCCCGGCCAGCTACATCTCCGGCGCCGGTGATAATAAAAGCAACAAAAGTTTTGCAATCT 
TTCCAAAGATTTTTGAGGATGATCATATTAAGAGTCAATGCAGTGTCTTCCCAAGAATTA 
AGATCTCGCAAAGTAACAATATCCACGATGCCTCCACGTGTAATTCTCCGGCCATAACCG 
TCTCCTCTGCCGCCGTAGCAGCTTCGCCGTGGGGCATGATCAACGTTAATACCACTAACA 
GTCC^GAAACTGTTTACTTGTCGATAATAATAACAACACGTCATCATGCTCACAGGTTC 
AGATCTCTTCTTCCCCTCGGAATCTCGGAATTAAGAGAAGGAAGAGCCAGGCAAAGAAAG 
TGGTGTGCATACCGGCTCC^GCCGCTATGAACAGCCGGTCCAGTGGAGAAGTTGTTCCGT 
CTGATCTATGGGCTTGGCGAAAGTACGGTCAAAAACCTATCAAAGGTTCTCCTTATCCAA 
GGGGTTACTACAGATGTAGCAGCTCAAAAGGTTGTTCAGCTAGGAAACAAGTCGAACGTA 
GCCGCACTGATCCAAACATGTTAGTGATTACTTACACCTCTGAGCATAACCACCCATGGC 
CTACTCAACGCAACGCTCTCGCAGGTTCCACTCGTTCCTCTTCCTCCTCCTCTTTAAACC 
CTTCTTCCAAATCCTCAACCGCAGCCGCCACTACTTCTCCCTCATCCAGAGTTTTCCAAA 
ACAACAGCAGCAAAGACGAACCCAATAACTCCAACTTGCCTTCCTCTTCCACTCATCCTC 
CTTTTGACGCCGCCGCAATTAAGGAGGAGAACGTGGAAGAGCGTCAGGAAAAGATGGAGT 
TCGATTATAATGACGTTGAAAATACCTATAGACCGGAGTTGTTGCAAGAGTTTCAACATC 
AGCCGGAGGATTTCTTTGCCGATCTCGACGAGCTTGAGGGAGATTCTTTGACTATGTTGC 
TCTCTCACAGTAGCGGCGGAGGCAACATGGAAAACAAAACGACGATTCCAGACGTTTTTA 
GTGATTTCTTTGACGACGACGAGTCCTCAAGGTCGTTATAAATATTGTTGTTAATGTATA 
CATAGAAATGAAATTATTCATGTAATTCGTTTTGTGTTAAATGACGGTATTTGCCTTTGC 
A 

>G196 Amino Acid Sequence (conserved domain in AA coordinates: 223-283)
MCSVFEFQDMDNFQGDLTDVVRGIGSGHVSPSPGPPEGPSPSSMSPPPTSDLHVEFPSAA 
TSASCLANPFGDPFVSMKDPLIHLPASYISGAGDNKSNKSFAIFPKIFEDDHIKSQCSVF 
PRIKISQSNNIHDASTCNSPAITVSSAAVAASPWGMINV^ 

SQVQISSSPRNLGIKRRKSQAKKVVCIPAPAAMNSRSSGEVVPSDLWAWRKYGQKPIKGS

PYPRGYYRCSSSKGCSARKQVERSRTDPNMLVITYTSEHNHPWPTQRNALAGSTRSSSSS

SLNPSSKSSTAAATTSPSSRVFQNNSSKDEPNNSNLPSSSTHPPFDAAAIKEENVEERQE

KMEFDYNDVENTYRPELLQEFQHQPEDFFADLDELEGDSLTMLLSHSSGGGNMENKTTIP 

DVFSDFFDDDESSRSL* 

>G1965 (1..609) 

ATGGATAACTTCAATGTTGTTGCCAATGAAGACAATCAAGTGAATGATGTGAAGCCTCCA 
CCACCCCCACCGCGAGTGTGTGC^GATGTGATTCTGATAACACAAAATTTTGTTACTAC 
AACAATTATAGTGAGTTTCAACCGCGCTACTTCTGCAAGAACTGTCGAAGATACTGGACT
CATGGTGGGGCTTTAAGAAACGTACCAATTGGTGGGAGTAGTCGTGCCAAGCGGACAAGG 
ATAAATCAACCTTCAGTTGCT(^GATGGTTTCTGTTGGAATCCAACC^GGGAACCGTm 

AGTTCTTTGTCTCATATTCATGGTGGTAT^ 

CGACCAAATCATCGCCTAGCTTTCCATAATGGATCATTTGAGCAAGATTATTATGATGTT 
GGGTCTGATAATCTTTTGGTAAACCAACAAGTTGGTGGATATGTTGATAATCACAACGGT 
TATCACATGAATCAAGTGGATCAATACAACTGGAACCAGAGCTTCAATAACGCTATGAAC 
ATGAATTATAATAACGCTAGCACTAGCGGAAGGATGCATCCTAGTCATTTAGAGAAGGGT 

GGTCCTTGA 

>G1965 Amino Acid Sequence (domain in AA coordinates: 27-55)

MDNFNWAllEDNQVNDVKPPPPPPRVCARCDSDNTKFCYYNNYSEFQPRyFCKNC 

HGGALRl^PIGGSSRAKRTRINQPSVAQMVSVGIQPGl^FSSIiSHIHGGMVTmTHPTQTF 

RPNHRLAFHNGSFEQDYYDVGSDNLLVNQQVGGYVDNHNGYHMNQVDQYNWNQSFNNAMN

MNYNNASTSGRMHPSHLEKGGP* 

>G1976 (1..1152) 

ATGACTGATCCITATTCCAATTTCTTCACAGACTGGTTCAAGTCTAATCCTTTTCACCAT 
TACCCTAATTCCTCC^CTAACCCCTCTCCTCATCCTCTTCCTCCTGTTACTCCTCCCTCT 
TCCTTCTTCTTCTTCCCTCAATCCGGAGACCTCCGCCGTCCACCGCCGCCACCAACTCCT 
CCTCCTTCTCCTCCTCTCCGAGAAGCCCTCCCTCTCCTCAGCCTCAGCCCCGCCAACAAA 
CAACAAGACCACCATCACAACCATGACCACCTTATTCAAGAACC^CCTTCAACCTCCATG 
GATGTCGACTACGATCATCACCATCAAGATGATCATCATAACCTCGATGACGATGACCAT 
GACGTCACCGTTGCTCTTCACATAGGCCTTCCAAGCCCTAGTGCTCAAGAGATGGCCTCT 
TTGCTCATGATGTCTTCTTCTTCCTCTTCCTCGAGGACCACTCATCATCACGAGGACATG 
AATCACAAGAAAGACCTCGACCATGAGTACAGCCACGGAGCTGTCGGAGGAGGAGAAGAT 
GACGATGAAGATTCAGTCGGCGGAGACGGCGGCTGTAGAATCAGCAGACTCAACAAGGGT 
CAATATTGjSATCCCTAC^CCTTCTCAGATTCTCATTGGCCCTACTCAGTTCTCATGTCCT 
GTTTGCTTCAAAACCTTCAACAGATACAATAACATGCAGATGCATATGTGGGGACATGGA 
TCACAATACAGAAAAGGACCTGAATCTCTAAGGGGAACACAACCAACAGGAATGCTAAGG 
CTTCCGTGCTATTGCTGCGCCCCAGGCTGTCGCAACAACATTGACCATCCAAGGGCAAAG 
CCTCTCAAAGACTTCAGAACCCTTCAAACACATTACAAGAGAAAACATGGGATCAAACCT 
TTCATGTGTAGGAAATGTGGAAAGGCTTTCGCAGTCCGAGGGGACTGGAGAACACATGAG 
AAGAATTGTGGCAAACTTTGGTATTGCATATGTGGATCTGATTTCAAGCACAAGAGATCT 
CTCAAAGATCACATCAAGGCTTTTGGGAATGGTCATGGAGCCTACGGAATTGATGGGTTT 
GATGAAGAAGATGAGCCTGCCTCTGAGGTAGAACAATTAGACAATGATCATGAGTCAATG 

CAGTCTAAATAG 

>G1976 Amino Acid Sequence (domain in AA coordinates: 219-323) 

MTDPYSNFFTDWFKSNPFHHYPNSSTNPSPHPLPPVTPPSSFFFFPQSGDIiRRPPPPPTP 

PPSPPLREALPLLSLSPANKQQDHHHl^HIiIQEPPSTSMDVDYDHHHQDDHHNLDDDDH 

DVTVALHIGLPSPSAQEMASLLMMSSSSSSSRTTHHHEDMNHKKDLDHEYSHGAVGGGED

DDEDSVGGDGGCRISRLNKGQYWIPTPSQILIGPTQFSCPVCFKTFNRYNl^QMHMWGHG 

SQYRKGPESLRGTQPTGMLRLPCTCCAPGCRNNIDHPRAKPLKDFRTLQTHYKRKHGIKP 

FMCRKCGKAFAWGDWRTHEKNCGKLWYCICGSDFKHKRSLKDHIKAFGNGHGAYGIDGF 

DEEDEPASEVEQLDNDHESMQSK*
>G2057 (27..1289)

GCCGTCTCGACGAATATGCTCTACCAATGTCTGACGACCAATTCCATCACCCGCCGCCTC 
CTTCTTCAATGAGGCACCGTTCTACGTCGGATGCGGCGGACGGCGGCTGCGGCGAGATTG 
TTGAGGTGCAAGGTGGTCACATTGTTCGGTCTACCGGAAGAAAAGACCGCCACAGCAAAG 
TCTGCACGGCTAAAGGGCCACGTGACCGGCGCGTGAGACTCTCTGCTCACACGGCGATTC 
AGTTTTACGATGTTCAAGACAGGCTTGGTTTCGACCGACCTAGCAAAGCCGTTGATTGGC 
TTATCAAAAAGGCTAAGACTTCCATTGACGAGCTCGCTGAGCTTCCTCCCTGGAATCCCG 
CCGATGCAATTCGCCTAGCCGCTGCTAACGCTAAACCCAGAAGAACCACCGCCAAAACCC 
AAATCTCTCCGTCTCCGCCACCGCCGCAACAGCAACAACAACAACAACAGCTTCAGTTCG 
GTGTTGGCTTCAACGGAGGAGGAGCAGAGCATCCGAGTAACAACGAGTCGAGTTTTCTCC 
CGCCGTCAATGGATTCAGATTCGATAGCTGACACTATAAAGTCGTTTTTTCCGGTGATTG 
GCTCTTCAACGGAGGCTCCTTCGAATCATAACCTTATGCACAACTATCATCATCAGCATC 
CGCCGGATTTGCTTTCTCGAACTAATAGCCAAAACCAAGATCTCCGTCTCTCGCTGCAAT 
CGTTCCCGGATGGTCCACCGTCGCTTCTGCACCACCAACATCACCACCACACCTCTGCTT 
CCGCCTCCGAGCCTACTCTGTTCTACGGACAGAGCAATCCGTTAGGGTTTGACACATCGA

GTTGGGAGGAGCAGTCGTCGGAATTCGGAAGGATTCAGAGACTAGTGGCTTGGAACAGCG 

GCGGTGGCGGCGGAGCAACCGATACAGGAAACGGAGGAGGGTTTCTGTTCGCTCCTCCTA 

CTCCTTCAACGACGTCGTTTCAGCCAGTTCTTGGCCAAAGCCAACAG 

GGGGTCCCCT^CAGTCCAGTTACAGTCCCATGATCCGTGCTTGGTTTGATCCTCACCATC 

ATCACCAATCCATCTC^ 

ACC^TCAGCAATCCCCGGAATCGGATTCGCCTCAGGTGAATTCTCTTCGGGTTTTCGCA 
TACCAGCACGGTTTCAGGGCCAAGAAGAGGAGCAGCACGACGGTCTCACTCACAAGCCGT 
CCTCTGCTTCCTCTATTTCTCGCCATTGACAATCGAAACTAATCCTC 

>G2057 Amino Acid Sequence (domain in AA coordinates: TBD) 

MSDDQFHHPPPPSSMRHRSTSDAADGGCGEIVEVQGGHIVRSTGRKDRHSKVCTAKGPRD 

RRVRLSAHTAIQFYDVQDRLGFDRPSKAVDWLIKKAKTSIDELAELPPWNPADAIRLAAA

NAKPRRTTAKTQISPSPPPPQQQQQQQQLQFGVGFNGGGAEHPSNNESSFLPPSMDSDSI 

ADTIKSFFPVIGSSTEAPSNHNLMHNyHHQHPPDLLSRTNSQNQDLRLSLQSFPDGPPSL 

LHHQHHHHTSASASEPTLFYGQSNPLGFDTSSWEQQSSEFGRIQRLVAWNSGGGGGATDT 

GNGGGFLFAPPTPSTTSFQPVLGQSQQLYSQRGPLQSSYSPMIRAWFDPHHHHQSISTDD 

LNHHHHLPPPVHQSAIPGIGFASGEFSSGFRIPARFQGQEEEQHDGLTHKPSSASSISRH 

* 

>G2107 (79..624)

ACCACAAAACAGAGCAACACACAACACAAAGCTTC^ 

TTGAGAACCAGATCGGAGATGGAAAACGACGATATCACCGTGGCGGAGATGAAGCCAAAG 
AAGCGTGCTGGACGGAGGATTTTCAAGGAGACACGTCACCCAATCTAGAGAGGCGTGCGG 
CGTAGGGACGGCGACAAATGGGTATGCGAAGTCCGTGAACCGATTCATCAGCGTCGAGTC 
TGGCTCGGAACTTATCCGACGGCAGATATGGCCGCACGTGCTCACGACGTGGCGGTTCTT 
GCTCTGCGCGGGAGATCCGCGTGTTTGAATTTCTCCGATTCTGCTTGGAGGTTGCCGGTG 
CCGGCATCCACTGATCCGGACACGATCAGGCGCACGGCGGCCGAAGCAGCGGAGATGTTC 
AGGCCGCCGGAGTTTAGTACAGGAATTACGGTTTTACCCTCAGCCAGTGAGTTTGACACG 
TCGGATGAAGGAGTCGCTGGAATGATGATGAGGCTCGCGGAGGAGCCGTTGATGTCGCCG 
CCAAGATCGTACATTGATATGAATACGAGTGTGTACGTGGACGAAGAAATGTGTTACGAA 
GATTTGTCACTTTGGAGTTACTAAAATACGTATGTGTTAAAAAACCAAAGATCGTATGTG 
TATGTATGCATAATAAATGGGCTTAATGATGGGCATAGATATGATAGGTCCAGCCTATAT 
GTTAAATGTGTTTTATTTTTTGGTTTATCTAGTTTCCTAGGTATTTACCAAATTGTATTA 
GTATAAGTTTTATTAAGAAATAATCAAAAATGTTGTTGCCAAAAAAAAAAAAAAAAAAAA 

AAAAA 

>G2107 Amino Acid Sequence (domain in AA coordinates: TBD) 
MENDDITVAEMKPKKRAGRRIFKETRHPIYRGVRRRDGDKWVCEVRE 
TADMAARAHDVAVLALRGRSACLNFSDSAWRLPVPASTDPDTIRRTAAEAAEMFRPPEFS 
TGITVLPSASEFDTSDEGVAGMMMRIiAEEPLMSPPRSYIDMNTSVYTOEEMCYEDLSLWS 

Y* 

>G211 (1..750) 

ATGATGTCATGTGGTGGGAAGAAGCCAGTGTCTAAGAAAACAACGCCGTGTTGCACGAAG 
ATGGGGATGAAGAGAGGACCATGGACGGTGGAGGAAGACGAGATTCTTGTGAGCTTCATT 
AAGAAAGAAGGTGAAGGACGGTGGCGATCGCTTCCTAAGAGAGCTGGTTTACTCAGATGT 
GGAAAGAGCTGTCGTCTACGGTGGATGAACTATCTCCGACCCTCGGTTAAACGTGGAGGA 
ATTACGTCGGACGAGGAAGATCTCATCCTCCGTCTTCACCGCCTCCTCGGCAACAGGTGG 
TCATTGATCGCGGGAAGGATACCGGGAAGGACTGATAATGAAATTAAGAACTATTGGAAC 
ACTCATCTTCGTAAGAAACTTTTAAGGCAAGGAATTGATCCTCAAACCCACAAGCCTCTT 
GATGCAAACAACATeCATAAACCAGAAGAAGAAGTTTCCGGTGGACAAAAGTACCCTCTA 
GAGCCTATTTCTAGTTCTCATACTGATGATACCACTGTTAATGGCGGGGATGGAGATAGC 
AAGAACAGTATCAATGTCTTTGGTGGTGAACACGGCTACGAAGACTTTGGTTTCTGCTAC 
GACGACAAGTTCTCATCGTTTCTTAATTCGCTCATCAACGATGTTGGTGATCCTTTTGGT 
AATATTATCCCAATATCTCAACCTTTGCAGATGGATGATTGTAAGGATGGGATTGTTGGA 
GCGTCGTCTTCTAGCTTAGGACATGACTAG 

>G211 Amino Acid Sequence (conserved domain in AA coordinates: 24-137)
MMSCGGKKPVSKKTTPCCTKMGMKRGPWTVEEDEILVSFIKKEGEGRWRSLPKRAGLLRC
GKSCRLRWMNYLRPSVKRGGITSDEEDLILRLHRLLGNRWSLIAGRIPGRTDNEIKNYWN
THLRKKLLRQGIDPQTHKPLDANNIHKPEEEVSGGQKYPLEPISSSHTDDTTVNGGDGDS
KNSINVFGGEHGYEDFGFCYDDKFSSFLNSLINDVGDPFGNIIPISQPLQMDDCKDGIVG

ASSSSLGHD* 
>G2133 (26..457)

ATCTCATCTTCATCCACCCAAAAACATGGATTCT^AGAGACACCGGAGAAACTGACCAGAG 
CAAGTACAAAGGTATCCGTCGTCGGAAATGGGGAAAATGGGTATCAGAGATTCGTGTCCC 
GGGAACTCGTCAACGTCTCTGGTTAGGCTCTTTCTCCACCGCAGAAGGCGCTGCCGTAGC 
CCACGACGTCGCTTTTTACTGCTTGCACCGACCATCTTCCCTCGACGACGAATCTTTTAA 
CTTCCCTCACTTACTTACAACCTCCCTCGCCTCCAATATATCTCCTAAGTCCATCCAAAA 
AGCTGCTTCCGACGCCGGCATGGCCGTGGACGCCGGATTCCATGGTGCTGTGTCTGGGAG 
TGGTGGTTGTGAAGAGAGATCTTCCATGGCGAATATGGAGGAGGAGGACAAACTTAGTAT 
CTCCGTGTATGATTATCTTGAAGACGATCTCGTTTGATCTATACGAGTACGTTTTTAGCA 

GTTAA 

>G2133 Amino Acid Sequence (domain in AA coordinates: 11-83)
OTSRDTGETDQSKYKGIRRRKWGKWVS 

HRPSSLDDESFNFPHLLTTSLASNISPKSIQKAASDAGMAVDAGFHGAVSGSGGCEERSS 
MANMEEEDKLSISVYDYLEDDLV* 

>G2134 (36..644)
CTTAGGAATTCCGGTA
ACAGCGACAAAGCGCAAAACGATGGCAAAGGTGTACCATCTGCCTACAGAGGAGTCCGGA 
AGAGAAAATGGGGGAAATGGGTGTCTGAAATCCGTGAACCGGGGACCAAGAACCGTATCT 
GGCTAGGCAGTTTCGAGACTCCTGAAATGGCTGCAACCGCATACGACGTGGCAGCATTTC 
ATTTCAGAGGGAGAGAAGCTCGTCTCAACTTCCCTGAGCTCGCCAGCAGCCTTCCACGTC 
CTGCAGACTCTAGCTCAGAC^GCATTCGCATGGCAGTTCATGAGGCAACACTCTGCCGCA 
CCACCGAAGGAACAGAGTCAGCCATGCAAGTGGACAGCTCAAGCTCCTCCAATGTAGCTC 
CAACAATGGTCAGACTCTCGCCCAGGGAAATTC^GCGATa^CGAGTCAACTTTGGGAT 
CTCCTACTACAATGATGCATTCAACATACGACCCTATGGAGTTTGCTAATGATGTGGAGA 
TGAATGCTTGGGAAACATACCAGAGTGACTTTCTTTGGGACCCTTAACCCCAAAACCTAA 
CTCATGGAGAGCTTCTACAGCTCAATCTTACAATACCAGCATAAGTTACTGGCTTAGAAT 
ACTTAAATTTATTGAAGTTTAGTTTTCAGAGTCTACCACAAGGGTTGTTGATTCTGACGT 
TATAGCAAAGAATAAAGCTCATCAGATTTTGGAGGGAAAGACTCTATGAGCTTGATGGGT 
CCCTGAAAGGACCTCTTCACAAATATTTTTAAATTTTTTTGTTACTAGTAGAAACATAGA 
TTATGAGGTGTGACTTATTATTATTTTTTACAATTGTTTGTTACCTCATTGATGTATTTG 

ATTT 

>G2134 Amino Acid Sequence (domain in AA coordinates: TBD) 
MAGLRfcTSGNSDKAQNDGKGVPSAYRGVRKRKWGKJ^ 

TAYDVAAFHFRGREARLNFPELASSLPRPADSSSDSIRMAVHEATLCRTTEGTESAMQVD
SSSSSNVAPTMVRLSPRBIQAINESTLGSPTTMMHSTYD^ 
WDP* PQNLTHGELLQLNLTIPA* 
>G2151 (236..1321)

TTTTTTTTTTAGGGTTCATAAGAACAAATTGGATTTTGAGCTCACAGTATAAATAACCCG 

ACTTTGATTACTGGGTAATTTTAAAACCGCCATTGTTGTTCTCTTTACTACTTTTGGGAA 

TTAGGGTTTATGATTTCTGGGTATTAGATTAGATAAATTTGTTTCCTTTTTTTGTTAATC 

AATTTAAAAATCTCTTATTTCTGTTAAAGACTTGTAATTTTGGAGTTTTTAATGCAT 

CGGAAGAGAAGCAATGGCATTTCCAGGCTCGCATTCTCAGTACTATCTTCAAAGAGGAGC 

CTTTACTT^ATCTCGCACCTTCCCAAGTCGCGAGTGGGCTTCACGCGCCGCCGCCACATAC 

GGGATTGAGGCCAATGTCTAACCCTAACATTCATCACCCTCAGGCTAACAATCCAGGACC 

TCCTTTCTCGGATTTTGGACACACCATTCACATGGGAGTGGTCTCCTCTGCTTCTGATGC 

TGATGTGCAACCGCCACCGCCACCGCCACCACCAGAGGAACCGATGGTTAAGAGGAAACG 

TGGACGGCCAAGAAAGTATGGAGAACCGATGGTTAGTAATAAGTCTAGGGACTCTTCTCC 

AATGTCTGATCCTAATGAACCTAAACGGGCCAGAGGTCGACCTCCTGGAACTGGAAGGAA 

GCAACGCTTGGCTAATCTTGGTGAGTGGATGAATACTTCAGCTGGACTTGCTTTTGCACC 

TCATGTGATCAGCATTGGAGCAGGAGAAGACATTGCTGCGAAAGTTTTGTCATTTTCACA 

ACAAAGACCTCGGGCTCTTTGTATAATGTCAGGCACTGGAACCATTTCTTCAGTCACTCT 

GTGCAAACCCGGTTCAACCGATCGTCACTTAACATACGAGGGACCTTTTGAGATTATAAG 

TTTTGGTGGATCTTATTTGGTGAATGAAGAAGGTGGATCCAGAAGTCGAACAGGCGGATT 

GAGTGTCTCTCTTTCTCGTCCCGATGGTAGTATTATTGCCGGTGGAGTTGACATGCTTAT 

CGCAGCCAACCTTGTTCAGGTGGTGGCATGTAGTTTTGTATACGGAGCAAGGGCAAAGAC 
TCATAATAACAATAACAAGACCATCAGACAAGAAAAGGAACCAAATGAAGAGGACAACAA
TAGTGAAATGGAGACCACACCGGGTAGTGCAGCTGAACCAGCAGCATCTGCGGGTCAGCA 
GACGCCACAGAACTTCTCTTCTCAGGGAATAAGGGGGTGGCCCGGTTCAGGCTCAGGCTC 
TGGCAGATCACTTGACATTTGCAGAAACCCACTCACTGATTTTG 

ATATACACTATTAGTCTTTGAAGCAGCAGCATACAAAATGTGATTGCTGTACATATGTTA 

TTGTAGATTTCTCTCTGGGAATGTTGAAATCAGACATTTAAGGATTGATACTAGATCTCT 

CAGCTCCTTCTAACATTGTTAATGTAACAGAACCCTCCCACTTTCATGCTATTTGC 

>G2151 Amino Acid Sequence (domain in AA coordinates: 93-113, 124-144)

MDGREAMAFPGSHSQYYLQRGAFTNLAPSQVASGLHAPPPHTGLRPMSNPNIHHPQANNP

GPPFSDFGHTIHMGWSSASDADVQPPPPPPPPEEPMVKRKRGRPRKYGEPMVSNKSRDS 

SPMSDPNEPKRARGRPPGTGRKQRLANLGEWMNTSAGLAFAPHVISIGAGEDIAAKVLSF 

SQQRPRALCIMSGTGTISSVTLCKPGSTDRHLTYEGPFEIISFGGSYLVNEEGGSRSRTG 

GLSVSLSRPIXBSIIAGGVDMLIAANLVQWACSFVYGARAKTHNN^ 

l^SEMETTPGSAAEPAASAGQQTPQNFSSQGIRGWPGSGSGSGRSLDICRNPLTDFDLTO 

G* 

>G2154 (82..1317)

GC7VAAAAGAAAAAATGAAAAAAAATCCCTAACTCTCTCTCTCTAGAAATTCTTATTTTTG 
TGCGTATCTCTCTAAAAAGGAATGGATCCTAACGAAAGCCACCATCACCACCAACAACAA 
CAGCTCCATCACCTCCACCAACAGCAACAGCAACAGCAGCAGCAGCAACGACTCACTTCT 
CCTTACTTCCACCACCAACTAC^GCACCATCACCACCTTCCAACCACCGTAGa^CCACC 
GCTTCTACCGGAAACGCCGTTCCATCTTCCAACAATGGGCTTTTCCCTCCGCAGCCTCAG 
CCACAGCACC^GCCTAATGATGGGTCATCTTCTCTCGCGGTGTACCCTCATTCAGTTCCG 
TCCTCGGCTGTGACGGCGCCGATGGAGCCGGTAAAGAGGAAGAGGGGTCGACCAAGAAAG 
TATGTGACGCCGGAACAAGCCCTAGCGGCTAAGAAATTGGCGTCTTCTGCGAGTAGTTCG 
TCTGCTAAACAGAGGCGAGAGCTTGCTGCTGTTACCGGTGGTACGGTATCGACTAATTCC 
GGGT(^TCCAAGAAATCTCAGCTTGGTTCTGTCGGGAAAACTGGAC^TGTTTTACTCCG 
CATATTGTTAATATAGCTCCTGGCGAGGATGTGGTCCAGAAAATTATGATGTTCGCAAAC 
O^GC^GCATGAACTATGCGTTCTTTCTGCATCAGGCACTATCTCTAATGC^TCCTTC 
CGCCAACCGGCTCCATCAGGAGGCAACTTACCATATGAGGGTCAATACGAGATTCTCTCA 
CTATCTGGATCCTATATCCGAACTGAACAAGGTGGTAAATCCGGCGGCCTTAGCGTTTCT 
TTATCTGCTTCAGATGGTCAGATCATCGGTGGAGCGATTGGTAGCCATCTCACAGCTGCT 
GGCCCGGTTCAGGTGATTCTTGGTACGTTTCAGCTTGATAGAAAGAAGGATGCCGCCGGG 
AGTGGTGGGAAAGGGGATGCTTCAAACAGTGGAAGTCGGTTAACTTCTCCTGTAAGCTCT 
GGACAGTTGCTTGGCATGGGTTTCCCTCCTGGTATGGAATCTACGGGAAGAAATCCAATG 
AGGGGAAACGACGAGCAACATGATCATCATCATCATCAAGCCGGTTTGGGTGGACCTCAT 
CATTTCATGATGCAAGCGCCGCAGGGGATACACATGACACATTCCAGGCCATCTGAATGG 
CGCGGAGGAGGCAACAGCGGTCATGATGGCAGAGGCGGTGGCGGGTATGATTTGTCAGGA 
AGGATAGGACATGAGTCGTCGGAGAATGGAGATTACGAGCAGCAAATACCGGATTAGCAG 
AGCTTCCAGGAGAAGTGTGTAGAGTTTAGATCCCAAGTAGAGAAACAGAAGGCGAGCAAA 
GAATCTGAACTGAGAGAGGACTTATTAGACAGAGACTCGTCTGAAGGGTCTTTAATCATA 
GA7UVGAAGTTGCTGAGTGATTGCTTTTGTTCTTCTTCTTGGTACGGTGTATTATATTAAC 
TCCACAACCTTTTTTTTATACTTTGAGTAA^^ 

TTTTTTTATACTCTTTTTCTTTTCTTATAATATTTTTTTTGGTTTTTCTTTCGTTTGTTA 

CTAAAAAAGGAAATGCTCTTTTTGTGAAATATATACACTTCGTTTG 

>G2154 Amino Acid Sequence (domain in AA coordinates: 97-119)

^PNESHHHHQQQQLHHLHQQQQQQQQQQRLTSPYFHHQLQHHHHLPTTVATTASTGNAV 

PSSNNGLFPPQPQPQHQPNDGSSSLAVYPHSVPSSAWAPMEPVKRKRGRPRKYVTPEQA 

LAAKKLASSASSSSIOCQRRELAAVTGGTVSTNSGSSKKSQLGSVGKTGQCFTPHIVNIAP 

GEDWQKII^FANQSKHELCVLSASGTISNASLRQPAPSGGNLPYEGQYEILSLSGSYIR 

TEQGGKSGGLSVSLSASDGQIIGGAIGSHLTAAGPVQVILGTFQLDRKKDAAGSGGKGDA 

SNSGSRLTSPVSSGQLLGMGFPPGMESTGRNPMRGNDEQHDHHHHQAGLGGPHHFMMQAP 

QGIHMTHSRPSEWRGGGNSGHDGRGGGGYDLSGRIGHESSENGDYEQQIPD* 

>G2157 (306..1238)

TCTTTTGATTTTAACCTTTTTTCAGTAGCAAGCCAAAAAAAAAAAACAGACAAAG 
CCTTTTATGATAAAGGTATGATGATAGCAAACAAATGATACCCCCATGTCTTGTGTGTCT 
GCTTCATGCAACATGTTGGTTTGGATTTGGTTAATCTAAAAGTTTAAGATAAGGTTTTCG 
GATTCTCTTCCTGTCTTGTAATAGTTTCTTGTCGGAGAGCCATCAACACO^CTTCAACA 
AAAAAAACAAGAAAAAGAAAAAGATTCTCTTTCTCGTTTTATTTCCATTAGAGAAGAAAA

AAAGAATGGCGAATCCTTGGTGGGTAGGGAATGTTGCGATCGGTGGAGTTGAGAGTCCAG 

TGACGTCATCAGCTCCTTCTTTGCACCACAGAAACAGTAACAACAACAACCCACCGACTA 

TGACTCGTTCGGATCCAAGATTGGACCATGACTTCACCACCAACAACAGTGGAAGCCCTA 

ATACCCAGACTCAGAGCCAAGAAGAACAGAACAGCAGAGACGAGCAACCAGCTGTTGAAC 

CCGGATCCGGATCCGGGTCTACGGGTCGTCGTCCTAGAGGTAGACCTCCTGGTTCCAAGA 

ACAAACCAAAGAGTCCAGTTGTTGTTACCAAAGAAAGCCCTAACTCTCTCCAGAGCCATG

TTCTTGAGATTGCTACGGGAGCTGACGTGGCGGAAAGCTTAAACGCCTTTGCTCGTAGAC 

GCGGCCGGGGCGTTTCGGTGCTGAGCGGTAGTGGTTTGGTTACTAATGTTACTCTGCGTC 

AGCCTGCTGCATCCGGTGGAGTTGTTAGTTTACGTGGTCAGTTTGAGATCTTGTCTATGT 

GTGGGGCTTTTCITCCTACGTCTGGCTCTCCTGCTGCAGCCGCTGGTTTAACCATTTACT 

TAGCTGGAGCTCAAGGTCAAGTTGTGGGAGGTGGAGTTGCTGGCCCGCTTATTGCCTCTG 

GACCCGTTATTGTGATAGCTGCTACGTTTTGCAATGCCACTTATGAGAGGTTACCGATTG 

AGGAAGAACAACAGCAAGAGCAGCCGCTTCAACTAGAAGATGGGAAGAAGCAGAAAGAAG 

AGAATGATGATAACGAGAGTGGGAATAACGGAAACGAAGGATCGATGCAGCCGCCGATGT 

ATAATATGCCTCCTAATTTTATCCCAAATGGTCATCAAATGGCTCAACACGACGTGTATT 

GGGGTGGTCCTCCGCCTCGTGCTCCTCCETCGTATTGATTAGTTAGATAGGCGGTGGTTG 

GTGCGTTCTTTTTACTGGAATGATTATATTTTCCATTAGGATGGTTAGGCT 

TAAAGCTATCAAGTTTCTTTTTTTTTTACGGATAATTCGGATGACAATTAGCTAGTGTTT 

GTTTGTTTGTTTTGTGGCGGCTTTTCTGACT^ 
TGAAAGTGAATTGATTGTAGAATCGTCTTTTGAAT^ 

>G2157 Amino Acid Sequence (domain in AA coordinates: 82-102, 164-107) 
MANPWWGNVAIGGVESPVTSSAPSLHHRN 

QTQSQEEQNSRDEQPAVEPGSGSGSTGRRPRGRPPGSKNKPKSPVVVTKESPNSLQSHVL 
EIATGADVAESLNAFARRRGRGVSVLSGSGLVTNVTLRQPAASGGVVSLRGQFEILSMCG 
AFLPTSGSPAAAAGLTIYLAGAQGQVVGGGVAGPLIASGPVIVIAATFCNATYERLPIEE
EQQQEQPLQLEDGKKQKEENDDNESGNNGNEGSMQPPMYNMPPNFIPNGHQMAQHDVYWG

GPPPRAPPSY* 
>G2181 (1..1005) 

ATGATGCTTGCGGTGGAAGATGTGTTAAGCGAACTCGCCGGAGAAGAAAGGAACGAGAGA 
GGATTGCCACCTGGCTTCCGGTTTCACCCGACGGACGAAGAGCTCATTACCTTCTACTTA 
GCTTCCAAAATCTTCCATGGTGGTCTCTCCGGCATTCACATTTCCGAAGTTGATCTCAAC 
CGCTGTGAACCTTGGGAGCTACCAGAAATGGCGAAGATGGGAGAGAGAGAGTGGTACTTT 
TATAGTCTAAGGGACAGGAAATATCCGACAGGTTTGAGGACTAACAGAGCAACTACTGCT 
GGATACTGGAAAGCTACCGGCAAAGATAAGGAAGTCTTCTCCGGCGGAGGAGGACAGCTT 
GTTGGGATGAAGAAGACGTTGGTGTTCTACAAAGGTAGGGCTCCACGTGGCCTCAAGACT 
AAGTGGGTCATGCATGAGTATCGCCTCGAAAACGACCATTCACACCGCCACACGTGTAAG 
GAGGAATGGGTGATTTGCAGAGTGTTCAATAAAACAGGAGACAGAAAAAATGTTGGATTA 
ATCCATAACCAAATCAGCTACCTTCATAACCATTCACTCTCAACAACACATCATCATCAT 
CATGAAGCCTTACCTTTGCTTATAGAACCTTCCAACAAAACCCTAACCAACTTCCCATCA 
CTACTCTACGATGATCCACACCAAAACTACAATAATAACAACTTCCTTCATGGATCATCA 
GGCCACAACATCGACGAGCTCAAAGCCTTAATCAACCCTGTCGTCTCTCAGCTCAACGGT 
ATCATCTTTCCTTCAGGGAACAACAACAACGACGAAGACGACTTCGACTTTAACCTCGGC 
GTGAAAACAGAGCAGTCTTCGAACGGTAACGAAATTGACGTACGAGATTACTTGGAGAAC 
CCTCTGTTTCAGGAAGCGAGTTATGGTCTGTTGGGTTTTTCGTCTTCTCCTGGACCTCTT 
CACATGCTACTAGATTCTCCATGTCCTTTAGGATTCCAGCTGTAG 

>G2181 Amino Acid Sequence (conserved domain in AA coordinates: 22-169)

MMLAVEDVLSELAGEERl^RGLPPGFRFHPTOEELITFYLASKIFHGGLSGIHISEVDLN 

RCEPWELPEMAKMGEREWYFYSLRDRKYPTGLRTNRATTAGYWKATGKDKEVFSGGGGQL 

VGMKKTLVF'YKGRAPRGLKTKWVI^EYRLENDH^ 

IHNQISYLHNHSLSTTHHHHHEALPLLIEPSNOT^ 

GHNIDELKALINPWSQLNGIIFPSGNNI^Era 

PLFQEASYGLLGFSSSPGPLHMLLDSPCPLGFQL* 

>G221 (115..795)

CTCTCTTATTCTCTCACTCTTTTTTTTTTATATTCCTCTCTCTCTAAATCTATAAAATAT 
ATTTAAAAACTTGATCGTATATAATAAAGTAAATAAAG7ATAATAACAAAAAAAATGGAG 
AAAAGAGGAGGAGGAAGTAGTGGAGGTTCGGGATCATCAGCAGAAGCAGAAGTGAGAAAA 
GGACCATGGACGATGGAAGAAGATCTTATTCTTATCAACTATATCGCCAACCACGGCGAT
GGTGTTTGGAATTCTCTCGCCAAATCTGCAGGTCTAAAACGAACCGGGAAAAGTTGCCGG 
CTCCGGTGGCTGAACTATCTCCGCCCCGACGTACGACGGGGAAACATCACTCCAGAAGAG 
CAACTTATCATCATGGAACTTCATGCTAAGTGGGGAAACAGGTGGTCGAAAATCGCCAAA 
CATCTTCCAGGAAGAACGGACAACGAGATCAAAAATTTCTGTAGGACAAGAATTCAAA^ 
TACATCAAGCAATCGGATGTAACAACAACATCGTCCGTTGGATCTCATCATAGCTCAGAG 
ATCAACGATCAAGCTGCAAGCACGTCGAGCCATAATGTCTTTTGTACACAAGATCAAGCG 
ATGGAGACTTATTCTCCTACACCGACATCATATCAACATACCAATATGGAATTCAACTAT 
GGTAACTATTCGGCCGCGGCAGTGACGGCAACCGTGGATTATCCAGTACCGATGACCGTT 
GATGATCAAACCGGTGAAAACTATTGGGGCATGGATGATACT 

TTGAATGGTAATTGATTGATCGGTGGACAAAACATGGAATATTAATTGAGTATTATATAT 

GATTTTTAGGAGTACTATTATTAGTACGTGACATGTATATGTTTTTGCCTCGTTGTAGAG 

GTTTGGGGTTATAATTAATATATAATGTTATCTAATATGCAACCTTGATACATATTTC 

TCTTTATTGAACCCATGTTATACATAAATAAAATTGTTGAAGGGGTCATAAAAAAAAAAA 

AAAAAAAAAAAAA 

>G221 Amino Acid Sequence (domain in AA coordinates: 21-125) 
MEKRGGGSSGGSGSSAEAEVRKGPWTMEEDLILINYI^

CRLRWLNYLRPDVRRGNITPEEQLIIMELHAKWGNRWSKIAKHLPGRTDNEIKNFCRTRI

QKYIKQSDVTTTSSVGSHHSSEINDQAASTSSHN^

NYGNYSAAAVTATVDYPVPMTVDDQTGENYWGMDDIWSSMHLLNGN*

>G2290 (119..982)

TTCTTTCTTTCTTTCTTTCT^ 

TCTCTACCTCTCTTTCTCTATCTTCTCTTATCACTACTTCTCTCGCCGATCAATCATCAT 
GAACGATCCTXjATAATCCCGATCTGAGCT^CGACGACTCTGCTTGGAGAGAACTCACACT 
CAC^GCTCAAGATTCTGACTTCTTCGACCGAGACACTTCCAATATCCTCTCTGACTTCGG 
TTGGAACCTCCACCACTCCTCCGATCATCCTCACAGTCTCAGATTCGACTCCGATTTAAC 
ACAAACCACCGGAGTCAAACCTACCACCGTCACTTCTTCTTGTTCCTCATCCGCCGCCGT 
TTCCGTTGCCGTTACCTCTACTAATAAT7VATCCCTCAGCTACCTCAAGTTCAAGTGAAGA 
TCCGGCCGAGAACTCAACCGCCTCCGCCGAGAAAACACCACCACCGGAGACACCAGTGAA 
GGAGAAGAAGAAGGCTCAAAAGCGAATTCGGCAACCAAGATTCGCATTCATGACCAAGAG 
TGATGTGGATAATCTTGAAGATGGATATCGATGGCGTAAATATGGACAAAAAGCCGTCAA 
GAATAGCCCATTCCCAAGGAGCTACTATAGATGCACAAACAGCAGATGCACGGTGAAGAA 
GAGAGTAGAACGTTCATCAGATGATCCATCGATAGTGATCACAACATACGAAGGACAACA 
TTGCCATCAAACCATTGGATTCCCTCGTGGTGGAATCCTCACTGCACACGACCCACATAG 
CTTCACTTCTCATCATCATCTCCCTCCTCCATTACCAAATCCTTATTATTACCAAGAACT 
CCTTCATCAACTTCACAGAGACAATAATGCTCCTTCACCGCGGTTACCCCGACCTACTAC 
TGAAGATACACCTGCCGTGTCTACTCCATCAGAGGAAGGCTTACTTGGTGATATTGTACC 
TCAAACTATGCGCAACCCTTGAGGTAAGCTTGGTACGTAGCAATAGCTAAGGAGdTGCTA 
ACTCATTATATATAGAAGATATTGCAGACCAGAATATGCGCAGGGAGGGTATAACAATAT 
GGCGTTGTAACAATGGATCTATATATTACCTCATTGTTGATCAATAGCACACCACCGGTA 
CGTTTGCAATTTCTTCATGTATATTTCTTGTTATATATGTAGTTATATATCCAGGTATAA 
TTTTGATGTAACACAACATTAATCTTAATCGTGGATCCATCCCACATTTGATGCATGTAT 
GTGCACTTAAGAAAAAGAACATGGAGGAAATAACGTTATTTTTTATTATTCT 

>G2290 Amino Acid Sequence (conserved domain in AA coordinates: 147-205)

MNDPDNPDLSNDDSAWRELTLTAQDSDFFDRDTSNILSDFGWNLHHSSDHPHSLRFDSDL 

TQTTGVTQ?TTVTSSCSSSAAVSVAWSTN 

KEK^CKAQKRIRQPRFAFMTKSDvTDNLEDGYRWRKYGQKAVlCNSPFPRSYYR 
KRVERSSDDPSIVITTYEGQHCHQTIGFPRGGILTAHDPHSFTSHHHLPPPLPNPYYYQE 
LLHQLHRDNNAPSPRLPRPTTEDTPAVSTPSEEGLLGDIVPQTMRNP* 
>G2299 (231..941)

GCCAAAATTTTACCAACATTTTTCTCTTCTCATATC^\AAGTTTCTCTCTCATTTCTTCAT 
CACACTTCACTGCCCTGTTTTTTTTCCTC^TTTTGAATAGTTCTCAAACTTATATATTTT 
TCCCCCTGAAGCCTAGCTATTTCTTTTTATTTGCATTAATCTCGGGATCCGAATCGAAAA 
AAGCAATCAGAATAATAGACTTGTACGATACTTGTGCCTAAGCTAACACAATGGCAGAGG 
AATACTACAGCCTCCGCTCGGAGAGAGTAACTCAGCTTCTTGTCCCTAACTCGGAGTCTG 
ACTCAGTGAGTGACAAAAGCAAAGCTGAGCAAAGCGAGAAGAAGACTAAACGTGGGAGAG 
ACTCCGGTAAACACCCTGTTTATCGCGGAGTAAGGATGAGGAACTGGGGAAAATGGGTGT 
CGGAGATTCGTGAGCCGAGGAAGAAATCACGTATTTGGCTGGGAACTTTCCCGACGCCGG
AGATGGCGGCGCGTGCACACGACGTGGCGGCTCTGAGCATTAAAGGAACGGCCGCTATAC 
TAAACTTCCCTGAACTCGCTGACTCATTCCCTCGACCCGTTTCATTAAGCCCTCGAGACA 
TTCAGACAGCAGCTCTTAAAGCAGCTCAC^^ 

CGTCTTCGTCGTCGTCTTTGTCTTCTACGTCTTCGCTCGAGTCTCTTGTGTTGGTGATGG 
ACCTCTCGAGGACTGAGTCGGAGGAGCTCGGTGAGATTGTGGAGCTTCCAAGTCTCGGGG 

ACTACTGTTTATATCCGCCGCCGTGGGGACAGTCGTCCGAAGATAACTATGGTCACGGAA 
TTAGCCCTAATTTTGGCCATGGCTTGTGATGG 
ACCATAATGTTTTGTTTAAAACAGTTTATTTTGTATCATTGCC^ 
CACGTTTTTAAAACCCTTTGCTGTTTTTGTTTT^ 

>G2299 Amino Acid Sequence (conserved domain in AA coordinates : 48-115) 

MAEEYYSLRSERWQLLVWSESDSVSDKSKAEQSEKKT^ 

KWSEIREPRKKSRIWLGTFPTPEMAARAHDVAAL 

PRDIQTAALKAAHl^PTTSFSSSTSSSSSLSSTSSLESLvliVT^LSRTESEELGEIVELP 
SLGASYDVDSANLGNEFVFYDSVDYCLYPPPWGQSSEDNYGHGISPNFGHGLSWDL*
>G2340 (274.. 1275) 

ATACAAAACTCCCTCTTCTCTATCTTCTTCATCTTAAAGAAAAAATAAGAGATATTCGTA 

AAGAGAGAACACAAAATTTCAGTTTACGAAAAGCTAGCAAAGTCGAGTATCGAGGAATAA 

CAGAATAAGACGTATCTATCCTTGCCTTAATGTTCTTACCAAAAGATCTAGTCCTTTCTT 

TGTATGATCGATCCATC^CAAGCCCACAACAACAAC^CTACATCTCTTTCTCTATCTOT 

AGCTTCTATTTTTAATACATTCAAGAATCAAGAATGGTACGGACGCCGTGTTGTAGAGCA 

GAAGGGTTGAAGAAAGGAGCATGGACTCAAGAAGAAGACCAAAAGCTTATCGCCTATGTT 

CAACGACATGGTGAAGGCGGTTGGCGAACCCTTCTOGAC7U\AGCTGGACTCA7^AAGATGT 

GGCAAAAGCTGC^GATTGAGATGGGCGAATTACTTAAGACCTGACATTAAACGTGGAGAG 

TTTAGCCAAGACGAGGAAGATTCCATCATCAACCTCCACGCCATTCATGGC^ 

TCGGCCATAGCTCGTAAAATACCAAGAAGAACAGACAATGAGATCAAGAACCATTGGAAC 

ACTCACATCAAGAAATGTCTGGTCAAGAAAGGTAT^ 

CTCGATGGAGCCGGTAAATCATCTGACCATTCCGCGCATCCCGAGAAAAGCAGCGTTCAT 
GACGACAAAGATGATCAGAATTCAAATAACAAAAAGTTC 

TTTTTGAACAGAGTAGCAAACAGATTCGGTCATAGAATCAACCACAATGTTCTGTCTGAT 
ATTATTGGAAGTAATGGCCTACTTACTAGTCACACTACTCCAACTACAAGTGTTTCAGAA 
GGTGAGAGGTCAACGAGTTCTTCCTCCACACATACCTCTTCGAATCTCCCCATCAACCGT 
AGCATAACCGTTGATGCAACATCTCTATCCTCATCCACGTTCTCTGACTCCCCCGACCCG 
TGTTTATACGAGGAAATAGTCGGTGACATTGAAGATATGACGAGATTTTCATCAAGATGT 
TTGAGT(^TGTTTTATCTCATGAAGATTTATTGATGTCCGTTGAGTCTTGTTTGGAGAAT 
ACTTCATTCATGAGGGAAATTACAATGATCTTTCAAGAGGATAAAATCGAGACGACGTCG 
TTTAATGATAGCTACGTGACGCCGATCAATGAAGTTGATGACTCCTGTGAAGGGATTGAC 
AATTATTTTGGATGAGTTATATTGATGATGATGAAAATTTGCATTTGGCATGTAAATCAA 
TTAGAGTTTGATTTGCTATGGTGTTTTTAGTTTGTGTGTGTAGTGTGTTTCGACCGTCAA 
AAAAAAAAAAAAAAAAAAAAAAAAAA 

>G2340 Amino Acid Sequence (domain in AA coordinates : 14-120) 
MVRTPCCRAEGLKKGAWTQEEDQKLIAYVQRHGEGGWRTLPDKAGLKRCGKSCRLRWANY
LRPDIKRGEFSQDEEDSIINLHAIHGNKWSAIARKIPRRTDNEIKNHWNTHIKKCLVKKG
IDPLTHKSLLDGAGKSSDHSAHPEKSSVT1DDKDDQNSNN 

RINHNVLSDIIGSNGLLTSHTTPTTSVSEGERSTSSSSTHTSSNLPINRSITVDATSLSS
STFSDSPDPCLYEEIVGDIEDMTRFSSRCLSHVLSHEDLLMSVESCLENTSFMREITMIF
QEDKIETTSFNDSYWPINEVDDSCEGIDNYFG* 
>G2346 (1..1011) 

ATGGAGTTGTTAATGTGTTCGGGTCAGGCCGAGTCAGGTGGTTCTTCTTCCACCGAGTCT 
TCTTCACTCAGTGGTGGACTCAGGTTTGGTCAGAAGATCTACTTCGAGGATGGATCCGGA 
TCCAGAAGCAAGAACCGGGTCAATACCGTTCGTAAGTCGTCTACCACGGCGAGGTGCCAA 
GTGGAAGGTTGTAGAATGGATCTAAGCAATGTTAAAGCTTATTACTCGAGACACAAAGTT 
TGTTGCATTCACTCTAAATCATCTA^ 

CAACAATGTAGCAGGTTTCACCAGCTTTCTGAGTTTGACTTGGAGAAAAGAAGTTGTCGC 
AGAAGACTCGCTTGTCATAACGAACGACGAAGAAAACCACAACCCACAACGGCTCTTTTC 
ACTTCTCATTACTCTCGAATCGCTCCATCTCTTTACGGAAACCCCAATGCTGCAATGATT
AAAAGCGTTTTGGGAGATCCTACTGCGTGGTCAACCGCAAGATCAGTGATGCAGCGGCCT
GGACCGTGGCAGATTAATCCAGTTAGGGAAACCCATCC 

GGAAGCTCAAGCTTTACTACATGTCCAGAGATGATAAACAACAATAGCACAGATTCAAGC 
TGTGCTCTCTCTCTTCTGTCAAACTCATACCCAATTCATCAGCAGCAACTTCAGACACCA 
ACAAATACATGGCGACCATCTTCTGGTTTCGACTCGATGATCTCATTCTCCGATAAGGTT 
AC^TGGCTCAGCCACCGCCCATTTCAACCCATC^^ 

TACCTCAGCCAAACTTGGGAAGTCATCGCGGGCGAAAAGAGCAATTCACATTATATGTCT 

CCTGTGAGTCAAATCTCGGAGCCAGCAGATTTCCAGATAAGCAATGGCAGTGTGTCGCCC 

TATTCTCCTCCGTCCTTACTATCTCTTGTGTGCTACTTGCGGCCGCTATAG 

>G2346 Amino Acid Sequence (domain in AA coordinates: 59-135) 

MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTTOiCSSTTARCQ 

VEGCI^LSNVKAYYSRHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCR 

RRIACHNERRRIO?QPTT7^FTSHYSRIAPSLYGNPNAAMIKSVLGDPTAWSTARSVMQRP 

GPWQINPWETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP 

TimraPSSGFDSMISFSDKVTMAQPPPISTHQPPI^ 

PVSQISEPADFQISNGSVSPYSPPSIiLSLVCYLRPL* 
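
The relationship between a nucleotide entry and its amino acid entry can be checked with a short script. The following is a minimal sketch, not part of the original listing. It assumes that the parenthesized range in each header (for example "(1..1011)") gives 1-based, inclusive coordinates of the coding region, and that the standard genetic code applies. The names parse_header and translate_cds are hypothetical helpers introduced only for illustration.

```python
import re
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Matches headers such as ">G2346 (1..1011)" and tolerates stray spaces
# around the dots, e.g. ">G2290 (119.. 982)".
HEADER_RE = re.compile(r"^>(\w+)\s*\((\d+)\s*\.\s*\.\s*(\d+)\)")


def parse_header(line):
    """Return (identifier, start, end) from a nucleotide header line."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a nucleotide header: {line!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def translate_cds(dna, start, end):
    """Translate dna[start..end] (assumed 1-based, inclusive).

    Whitespace is ignored. Codons that are not in the standard table
    (for example, ones containing OCR damage) are reported as 'X'.
    Translation stops after the first stop codon, written as '*'.
    """
    cds = re.sub(r"\s+", "", dna).upper()[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


if __name__ == "__main__":
    name, start, end = parse_header(">G2346 (1..1011)")
    # First 30 nucleotides of the G2346 entry above.
    print(name, translate_cds("ATGGAGTTGTTAATGTGTTCGGGTCAGGCC", 1, 30))
```

Applied to the first 30 nucleotides of G2346, the sketch yields MELLMCSGQA, which agrees with the start of the G2346 amino acid sequence listed above. Lines that contain extraction damage will produce 'X' residues or frame shifts, so a mismatch is not by itself evidence of an error in the underlying sequence.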

>G237 (1..852) 

ATGGCGAAGACGAAATATGGAGAGAGACATAGGAAAGGGTTATGGTCACCTGAAGAAGAC 
GAGAAGCTAAGGAGCTTCATCCTCTCTTATGGCCATTCTTGCTGGACCACTGTTCCCATC 
AAAGCTGGGTTACAAAGGAATGGGAAGAGCTGCAGATTAAGATGGATTAATTACCTAAGA 
CCAGGGTTAAAGAGGGATATGATTAGTGCAGAAGAAGAAGAGACTATCTTGACGTTTCAT 
TCTCCCTTGGGTAACAAGTGGTCGCAAATAGCTAAATTCTTACCGGGAAGAACAGACAAT 
GAGATAAAGAACTATTGGCACTCTCATTTGAAAAAGAAATGGCTCAAGTCTCAGAGCTTA 
CAAGATGCAAAATCTATTTCCCCTCCTTCGTCTTCATCATCATCACTTGTTGCTTGTGGA 
GAAAGAAATCCGGAAACCTTGATCTCGAATCACGTGTTCTCCCTCCAGAGACTTCTAGAG 
AACAAATCTTCATCTCCCTCACAAGAAAGCAACGGAAATAACAGCCATCAATGTTCTTCT 
GCTCCTGAGATTCCAAGGCTTTTCTTCTCTGAATGGCTTTCTTCTTCATATCCCCACACC
GATTATTCCTCTGAGTTTACCGACTCTAAGCACAGTCAAGCTCCAAATGTCGAAGAGACT 
CTCTCAGCTTATGAAGAAATGGGTGATGTTGATCAGTTCCATTACAACGAAATGATGATC 
AACAACAGCAACTGGACTCTTAACGACATTGTGTTTGGTTCCAAATGTAAGAAGCAGGAG 
CATCATATTTATAGAGAGGCTTCAGATTGTAATTCTTCTGCTGAATTCTTTTCTCCACCA 
ACAACGACGTAAATTGCGTTTATTGTAATGTAAATCAAATTTCTAAGGCAAAACCGGAAA 
AAAAAAAAAAAAAAAAAAAA 

>G237 Amino Acid Sequence (domain in AA coordinates: 11-113) 
MAKTKYGERHRKGLWSPEEDEKLRSFILSYGHSCWTTVPIKAGLQRNGKSCRLRWINYLR 
PGLKRDMISAEEEETILTFHSPLG1TKWSQIAKFLP 

QDAKSISPPSSSSSSLVACGERNPETLISNHVFSLQRLLENKSSSPSQESNGNNSHQCSS 
APEIPRLFFSEWLSSSYPHTDYSSEFTDSKHSQAPNVEETLSAYEEMGDVDQFHYNEMMI 
NNSNWTLNDIVFGSKCKKQEHHIYREASDCNSSAEFFSPPTTT*
>G2373 (48.. 1199) 

GCAAAATCCTCAGATCGTCTTACCTTCTCCGAATCGATCGATTTTTCATGGAGGACGACG 
ACGAGATTCAGTCAATTCCATCTCCGGGAGATTCTTCCCTTTCACCACAAGCTCCTCCTT 
CTCCGCCGATTTTGCCAACAAACGACGTGACGGTGGCCGTCGTGAAGAAACCACAACCGG 
GGCTTTCTTCTCAATCTCCGTCCATGAACGCTTTAGCGTTAGTGGTTCATACTCCTTCTG 
TAACCGGTGGTGGTGGTAGCGGAAACAGAAACGGACGAGGAGGAGGAGGAGGAAGCGGTG 
GTGGTGGAGGAGGAAGAGATGATTGTTGGAGCGAAGAAGCTACAAAGGTTCTAATCGAAG 
CTTGGGGAGATCGATTCTCTGAACCAGGTAAAGGAACTTTGAAGCAACAACATTGGAAAG 
AAGTAGCTGAGATTSTGAACAAGAGTCGTCAATGCAAATACCCTAAAACTGATATTCAGT 
GTAAGAACAGAATTGATACGGTGAAGAAGAAGTATAAGCAAGAGAAAGCTAAGATTGCTT 
CTGGTGATGGACCTAGTAAATGGGTTTTCTTCAAGAAGCTTGAGAGTTTGATTGGTGGTA 

CTACAACATTCATTGCTTCTTCAAAAGCTTCAGAGAAGGCTCCTATGGGAGGAGCTCTTG
GGAATAGCCGTTCGAGTATGTTTAAACGGC^AACTAAAGGTAATCAGATTGTGCAGCAAG
AACAAGAGAAGAGAGGCTCTGATTCGATGCGGTGGCATTTTAGGAAACGTAGTGCTTCTG
AGACTGAGTCTGAGTCTGATCCTGAACCTGAGGCTTCTCCTGAGGAATCTGCTGAGAGTC
TCCCACCTTTGCAACCGATTCAACCGCTTTCGTTTCATATGCCAAAGCGGTTGAAGGTGG

ATAAGAGTGGAGGTGGAGGGAGTGGAGTTGGAGATGTGGCGAGGGCGATACTTGGATTTA 
CGGAAGCTTATGAGAAGGCGGAAACTGCTAAGCTTAAGTTAATGGCGGAACTGGAAAAGG 
AGAGGATGAAATTTGCTAAAGAGATGGAGTTGCAGAGAATGCAGTTCTTGAAAACTCAAT
TGGAGATAACACAGAACAATCAAGAAGAGGAAGAGAGGAGCAGGCAGCGAGGAGAAAGGA 
GGATCGTTGATGATGATGATGATCGCAATGGCAAGAATAACGGCAATGTAAGTAGCTGAC 
AATTGAACACACAAATGTTCCTATGATATTTGCTATGATAAGCTGGATTTTAGGTTTTGA 
TGG 

>G2373 Amino Acid Sequence (domain in AA coordinates : 290-350) 
MEDDDEIQSIPSPGDSSLSPQAPPSPPILPTNDVTVAWK^^ 

HTPSVTGGGGSGNRNGRGGGGGSGGGGGGRDDCWSEEATKVLIEAWGDRFSEPGKGTLKQ 
QHWKEVAEIWKSRQCKYPKTDIQCKNRIDTVKKKYKQEKAKIAS 

LIGGTTTFIASSKASEKAPMGGALGNSRSSMFKRQTKGNQIVQQQQEKRGSDSMRWHFRK 
RSASETESESDPEPEASPEESAESLPPLQPIQPLSFHMPKRLKVDKSGGGGSGVGDVARA 
ILGFTEAYEKAETAKLKLMAELEKERMKFAKEMELQRMQFLKTQLEITQNNQEEEERS 
RGERRIVDDDDDRNGKNNGNVSS*
>G2376 (39.. 1370) 

CACGAGCTTCTGACTCAGATCCGGCGATATCGAATTCCATGGAGGACGATGAAGACATCC 
GATCTCAGGGTTCCGATTCACCTGATCCGTCTTCCTCCCCGCCGGCGGGACGAATCACGG 
TTACGGTGGCTTCGGCAGGTCCGCCTTCTTATTCTCTGACTCCTCCGGGTAATTCGTCGC 
AGAAGGATCCGGATGCGTTGGCTCTGGCGCTGCTTCCGATTCAGGCCAGCGGTGGAGGGA 
ATAACAGCAGTGGGAGACCAACCGGCGGCGGCGGGAGGGAGGATTGTTGGAGCGAAGCAG 
CTACGGCTGTGTTGATTGATGCGTGGGGTGAGAGATACTTGGAGCTTAGCAGAGGGAATC 
TGAAGCAGAAGCACTGGAAAGAGGTGGCTGAGATTGTGAGCAGCAGAGAGGATTACGGTA 
AAATTCCCAAAACTGATATACAGTGTAAGAATAGGATCGATACGGTGAAGAAGAAGTATA 
AACAAGAGAAGGTGAGAATCGCTAACGGCGGTGGCCGTAGCAGATGGGTGTTCTTCGACA 
AGCTTGACCGTCTGATTGGATCAACGGCGAAGATCCCGACGGCAACTTCTGGAGTCAGCG 
GTCCTGTCGGAGGATTGCATAAGATTCCTATGGGTATTCCAATGGGAAGTCGTTCGAATC 
TGTACCATCAGCAAGCTAAGGCTGCAACACCGCCTTTCAATAATCTTGACCGGTTAATTG 
GAGCTACGGCTAGAGTCTCAGCTGCTTCTTTCGGTGGCAGTGGTGGAGGAGGCGGAGGAG 
GATCTGTCAATGTACCTATGGGAATTCCGATGAGTAGCCGTTCAGCTCCGTTTGGACAGC 
AAGGGAGGACTCTGCCACAGCAAGGTAGGACACTGCCACAGCAACAGCAGCAAGGGATGA 
TGGTGAAAAGGTGTAGTGAGTCGAAACGCTGGCGTTTCAGGAAGAGGAACGCTTCTGATT 
CAGACTCGGAATCTGAAGCAGCAATGTCAGATGATTCCGGTGACAGTTTACCACCTCCTC 
CTCTGTCGAAGAGGATGAAGACGGAGGAGAAGAAGAAGCAAGATGGTGATGGAGTGGGGA 
ACAAATGGAGGGAGCTGACTCGGGCAATCATGAGATTCGGTGAAGCTTATGAGCAAACAG 
AGAATGCGAAACTGCAACAGGTGGTTGAGATGGAGAAAGAGAGGATGAAGTTCTTGAAGG 
AGCTTGAGTTGCAGAGAATGCAGTTCTTTGTGAAGACTCAATTGGAGATATCACAACTTA 
AGCAGCAACATGGGAGGAGAATGGGAAACACCAGTAATGATCATCATCACAGCCGCAAGA 
ACAACATGAATGCGATTGTGAACAACAACAACGATTTGGGTAATAACTAGAATTTAGTGA 
TGCAGTGTCGTAATTGATATATTTTAGATTTGAG 

>G2376 Amino Acid Sequence (domain in AA coordinates: 79-178, 336-408)

MEDDEDIRSQGSDSPDPSSSPPAGRITVTVASAGPPSYSLTPPGNSSQKDPDALALALLP 

IQASGGGNNSSGRPTGGGGREDCWSEAATAVLIDAWGERYLELSRGNLKQKHWKEVAEIV 

SSREDYGKIPKTDIQCKNRIDTVKKKYKQEKVRIANGGGRSRWVFFDKLDRLIGSTAKIP 

TATSGVSGPVGGLHKIPMGIPMGSRSNLYHQQAKAATPPFNNLDRLIGATARVSAASFGG

SGGGGGGGSVWPMGIPMSSRSAPFGQQGRTLPQQGRTLPQQQQQGMMVKRCSESKRWRF 

RKRNASDSDSESEAAMSDDSGDSLPPPPLSKRMKTEEKKKQDGDGVGNKWRELTRAIMRF

GEAYEQTENAKLQQVVEMEKERMKFLKELELQRMQFFVKTQLEISQLKQQHGRRMGNTSN 

DHHHSRKNNINAIVNNNNDLGNN*

>G24 (194..724)

CGGACGCGTGGGCAAATATTAAAATAAAAAGTGTCGGTGAATTCTCAATCTTTGTCTTCT 
TTCGTCGTCTCTTTAAAACTCCTCCGTCCCTCCTTATTATGTAACCGTCTCGCCGTCAAA 
TTTTCAAAATCTCTCCCTCCGTTCATAAACCCAGATCGAAATTTATGGTTTTGTAATTTT 
TTTACCGGCGGTTATGGAGACGGAAGCGGCGGTGACAGCGACGGTTACGGCGGCGACGAT 
GGGGATTGGGACGAGGAAGAGAGATCTGAAACCGTATAAAGGAATACGAATGAGGAAATG 
GGGGAAATGGGTGGCGGAGATACGGGAACCGAATAAGAGATCAAGGATCTGGTTAGGTTC 
TTATGCGACGCCTGAAGCGGCGGCGAGAGCTTACGACACTGCTGTTTTTTACCTCCGTGG 
TCCTTCAGCGAGGCTTAATTTTCCGGAGCTTTTGGCTGGACTTACTGTTTCTAACGGCGG 
AGGAAGAGGTGGTGATTTATCGGCGGCGTATATTAGGAGAAAAGCGGCGGAGGTTGGTGC 
TCAGGTTGATGCGCTTGGAGCGACGGTGGTTGTGAATACCGGCGGCGAGAATCGCGGTGA
TTACGAGAAGATTGAGAATTGTCGTAAGAGCGGTAACGGGTCATTGGAACGGGTCGATTT 
GAATAAATTACCCGACCCGGAAAATTCGGATGGTGATGATGACGAATGTGTGAAAAGAAG 
ATAGAAAAAATAAAAAGTAGTTGTAGAAGGAGAGACGAGAATGTTTGTCTTTAAGATGCG 
CTGTTGCCGCTAACATGCGCTTTCGATTTT^^ 

GGTTTTGTTTTCGTCGTCGATAATCAAAGATTTTAAAACACAATTCTCAAATTTCT 

TGTTACAAACTAGATTTGCATGATCTTTGTATTAACGAATAACGATTAAGTCCTAAA 

>G24 Amino Acid Sequence (domain in AA coordinates: 25-93) 

METEAAVTATVTAATMGIGTRKRDLKPYKGIRMRKWGKWVAEIREPNKRSRIWLGSYATP

EAAARAYDTAVFYLRGPSARLNFPELLAGLTVSNGGGRGGDLSAAYIRRKAAEVGAQVDA 

LGATVVWTGGENRGDYEKlENraKSGNGSLERVDLNKLPDPENSIX3DD 

>G2424 (1..999) 

ATGAGGATGGAGATGGTGCATGCTGACGTGGCGTCT 

TCTTCTTTGTCTTCGTCCTCACATCATCACTATAACCAACAACAACATTGTATCATGT^ 

GAAGATCAACACCATTCGATGGATCAGACCACTTCATCGGACTACTTCTCTTTAAATATC 

GACAATGCTCAACATCTCCGTAGCTACTACACAAGTCATAGAGAAGAAGACATGAACCCT 

AATCTAAGTGATTACAGTAATTGO^CAAGAAAGACAC^CAGTCTATAGAAGCTGTGGA 

CACTCGTCAAAAGCTTCGGTGTCTAGAGGACATTGGAGACCAGCTGAAGATACTAAGCTC 

AAAGAACTAGTCGCCGTCTACGGTCCACAAAACTGGAACCTCATAGCTGAGAAGCTCCAA 

GGAAGATCCGGGAAAAGCTGTAGGCTTCGATGGTTTAACCAACTAGACCCAAGGATAAAT 

AGAAGAGCCTTCACTGAGGAAGAAGAAGAGAGGCTAATGCAAGCTCATAGGCTTTATGGT 

AACAAATGGGCGATGATAGCGAGGCTTTTCCCTGGTAGGACTGATAATTCTGTGAAGAAC 

CATTGGCATGTTATAATGGCTCGCAAGTTTAGGGAACAATCTTCTTCTTACCGTAGGAGG 

AAGACGATGGTTTCTCTTAAGCCACTCATTAACCCTAATCCTCACATTTTC^TGATTTT 

GACCCTACCCGGTTAGCTTTGACCCACCTTGCTAGTAGTGACCATAAGCAGCTTATGTTA 

CCAGTTCCTTGCTTCCCAGGTTATGATCATGAAAATGAGAGTCCATTAATGGTGGATATG 

TTCGAAACCCAAATGATGGTTGGCGATTAC^CTGCATGG 

GATTTCTTAAACCAAACCGGGAAGAGTGAGATATTTGAAAGAATCAATGAGGAGAAGAAA 
CCACCATTTTTCGATTTTCTTGGGTTGGGGACGGTGTGA 

>G2424 Amino Acid Sequence (conserved domain in AA coordinates : 107-219) 

MRMEMVHADVASLSITPCFPSSLSSSSHHHYNQQQHCIMSEDQHHSMDQTTSSDYFSLNI 

DNAQHLRS YYTSHREED^PNLSDYSNCNKKDTTVYRS CGHSS KASVSRGHWRPAEDTKL 

KELVAVYGPQmmLIAEKLQGRSGKSCRLRWFNQLDPRINRRAFTEEEEERLMQAHRLYG 

NKWAMIARLFPGRTDNSVICNHWHVIMARKFREQSSSYRRRK^ 

DPTRLALTHLASSDHKQLMLPVPCFPGYDHENESPLMVDMFETQMMVGDYIAWTQEATTF

DFLNQTGKSEIFERINEEKKPPFFDFLGLGTV* 

>G2505 (1..1026) 

ATGGGTTCTTCGTCGAACGGAGGAGTGCCACCTGGTTTCCGGTTTCATCCGACGGACGAA 
GAGCTTCTCCATTACTACTTGAAGAAGAAAATCTCTTACCAAAAGTTTGAGATGGAAGTC 
ATCAGAGAGGTTGACTTAAACAAGCTTGAGCCTTGGGATTTGCAAGAGAGATGCAAGATA 
GGATCAACACCACAiU^CGAATGGTACTTCTTCAGCC^CAAGGACAGGAAATATCCGACG 
GGGTCAAGGACCAACCGTGCTACTCATGCAGGGTTCTGGAAGGCGACGGGACGTGACAAG 
TGCATAAGGAACTCTTACAAAAAGATAGGAATGAGGAAGACACTTGTGTTCTACAAAGGT 
AGAGCTCCTCATGGCCAAAAGACTGATTGGATCATGCATGAGTACCGTCTTGAAGACGCT 
GATGATCCTCAAGCCAACCCTAGTGAAGATGGATGGGTGGTATGTAGAGTGTTTATGAAG 
AAAAATTTGTTCAAGGTAGTAAATGAAGGTAGCTCAAGCATTAACTCATTGGACCAACAC 
AACCATGACGCATCTAACAACAACCATGCACTTCAAGCTCGTAGCTTTATGCACGGAGAC 
AGTCCATACCAGCTAGTACGTAACCACGGAGCCATGACATTCGAACTTAACAAGCCTGAC 
CTTGCTCTTCATCAATACCCACCAATCTTCCACAAGCCACCTTCACTTGGATTTGACTAC 
TCTTCAGGACTTGCAAGGGACAGTGAGAGTGCGGCTAGTGAAGGGTTACAATACCAGCAA 
GCGTGTGAGCCGGGTTTAGACGTTGGTACATGTGAGACAGTGGCTAGTCATAATCATCAA 
CAAGGTCTAGGTGAATGGGCAATGATGGATAGACTTGTGACTTGTCACATGGGAAATGAA 
GATTCCTCTAGAGGGATTACGTATGAGGATGGTAACAACAATTCGTCCTCTGTGGTTCAG 
CCAGTTCCCGCGACGAACCAGCTAACATTGCGTAGTGAGATGGATTTCTGGGGTTATTCT 
AAATAG 

>G2505 Amino Acid Sequence (domain in AA coordinates: 10-159) 
MGSSSNGGVPPGFRFHPTDEELLHYYLKKKISYQKFEMEVIREVDLNKLEPWDLQERCKI 
GSTPQNEWYFFSHKDRKYPTGSRTNR^

RAPHGQKTDWIMHEYRLEDADDPQANPSEDGWWCRWMKKNLFKVVNEGSSSINSLDQH 
NHDASNNNHALQARSFMHRDSPYQLVRNHGAMTFELNKPDLALHQYPPIFHKPPSLGFDY
SSGLARDSESAASEGLQYQQACEPGLDVGTCETVASHN^^ 
DSSRGITYEDGNNNSSSWQPVPATNQLTLRSEMDFWGYSK* 
>G2512 (64.. 798) 

AACTTAGTGCCACTTAGACACAATAAGAAAACCGTTAACAAGAAGAAAAAAAAAAGATCG 
AAAATGGAATATCAAACTAACTTCTTAAGTGGAGAGTTTTCCCCGGAGAACTCTTCTTCA 
AGCTCATGGAGCTGACAAGAATCATTCTTGTGGGAAGAGAGTTTC 

GACCAATCCTTCCTTTTATCTAGCCCTACTGATAACTACTGTGATGACTTCTTTGCATTT 
GAATCATCAATCATAAAAGAAGAAGGAAAAGAAGCCACCGTGGCGGCCGAGGAGGAGGAG 
AAGTCATACAGAGGAGTGAGGAAACGGCCGTGGGGGAAATTCGCGGCCGAGATAAGAGAC 
TCAACGAGGAAAGGGATAAGAGTGTGGCTTGGGACATTCGACACCGCGGAGGCGGCGGCT 
CTCGCTTATGATCAGGCGGCTTTCGCTTTGAAAGGCAGCCTCGCAGTACTCAATTTCCCC 
GCGGATGTCGTTGAAGAATCTCTCCGGAAGATGGAGAATGTGAATCTCAATGATGGAGAG 
TCTCCGGTGATAGCCTTGAAGAGAAAACACTCCATGAGAAACCGTCCTAGAGGAAAGAAG 
AAATCTTCTTCTTCTTCGACGTTGACATCTTCTCCTTCTTCCTCCTCCTCCTATTCATCT 
TCTTCGTCTTCTTCTTCTTTGTCGTCAAGAAGTAGAAAACAGAGTGTTGTTATGACGCAA 
GAAAGTAATACAACACTTGTGGTTCTTGAGGATTTAGGTGCTGAATACTTAGAAGAGCTT 
ATGAGATCATGTTCTTGATAATCTCTGCTTCTAC^TTTTTATGTAATTTGA 

>G2512 Amino Acid Sequence (conserved domain in AA coordinates : 79-139) 
MEYQTNFLSGEFSPENSSSSSWSSQESFLWEESFLHQSFDQSFLLSSPTDNYCDDFFAFE 
SSIIKEEGKEATVAAEEEEKSYRGWKRPWGKFAAEIRDSTRKGIRVWLGTFDTAEAAAL 
AYDQAAFALKGSLAVLNFPADWEESLRKMENVNM 

SSSSSTLTSSPSSSSSYSSSSSSSSLSSRSRKQSWMTQESNTTLWLEDLGAEYLEELM 
RSCS* 

>G2513 (69.. 698) 

TTTCAACAGTAATTTAAGTTAACCGGAGTCTCTTTTTGTTTTCCGGCGAATTTTTGGTAC 
TTTGAGTTATGAATAATGATGATATTATTCTGGCGGAGATGAGGCCTAAGAAGCGTGCGG 
GAAGGAGAGTGTTTAAGGAGACACGTCACCCAGTTTACAGAGGCATAAGGCGGAGGAACG 
GTGACAAATGGGTCTGCGAAGTCAGAGAACCGACGCACCAACGCCGCATTTGGCTCGGGA 
CTTATCCCACAGCAGATATGGCAGCGCGTGCACACGACGTGGCGGTTTTAGCTCTGCGTG 
GGAGATCCGCATGTTTGAATTTCGCCGACTCCGCTTGGCGGCTTCCGGTGCCGGAATCCA 
ATGATCCGGATGTGATAAGAAGAGTTGCGGCGGAAGCTGCGGAGATGTTTAGGCCGGTGG 
ATTTAGAAAGTGGAATTACGGTTTTGCCTTGTGCGGGAGATGATGTGGATTTGGGTTTTG 
GTTCGGGTTCCGGCTCTGGTTCGGGATCGGAGGAGAGGAATTCTTCTTCGTATGGATTTG 
GAGACTACGAAGAAGTCTCAACGACGATGATGAGACTCGCGGAGGGGCCACTAATGTCGC 
CGCCGCGATCGTATATGGAAGACATGACTCCTACTAATGTTTACACGGAAGAAGAGATGT 
GTTATGAAGATATGTCATTGTGGAGTTACAGATATTAAGTGGGACTCACATATCTACTAT 
ACATAATATTTAGCTTTTATGTAAGAGGTATTTATGTGAGTTTTAAGATTGTAGATGTGT 
CCCAGGCGTTAGAAGTTTCCTTGATGGTATGGAATCTTTGTACCTATAAAATTATAAAAT 
T 

>G2513 Amino Acid Sequence (domain in AA coordinates: TBD) 
MNNDDIILAEMRPKKRAGRRVFKETRHPVYRGIRRRNGDKWVCEVREPTHQRRIWLGTYP
TADMAARAHDVAVLALRGRSACLNFADSAWRLPVPESNDPDVIRRVAAEAAEMFRPVDLE
SGITVLPCAGDDVDLGFGSGSGSGSGSEERNSSSYGFGDYEEVSTTMMRLAEGPLMSPPR
SYMEDMTPTNVYTEEEMCYEDMSLWSYRY* 
>G2519 (83.. 691) 

CAAAGTGAAAACATAAGATCATCTTCTTCGTTGATAGATCAATATAGGAACTCCAGAAGA 
GAATCTTGATC7UVTTAAGTATCATGTCTCACATCGCTGTTGAAAGGAATCGAAGAAGGCA 
AATGAACGAGCATCTTAAATCCCTTCGTTCTTTGACTCCTTGTTTCTACATCAAAAGGGG 
AGATCAAGCTTCGATCATCGGAGGAGTGATAGAGTTCATCAAAGAGTTGCAGCAATTGGT 
TCAAGTTCTTGAGTCCAAGAAACGTCGAAAGACCCTAAACCGACCATCTTTCCCTTATGA 
TCACCAGACAATCGAGCCATCC^GTTTAGGAGCCGCCACTACCCGAGTACCGTTTAGTCG 
AATCGAAAATGTGATGACCACAAGTACTTTCAAGGAAGTAGGAGCATGCTGTAACTCCCC 
TCATGCTAACGTAGAAGCAAAGATTTCAGGTTCTAATGTTGTATTGAGAGTTGTCTCTAG 
GCGAATCGTGGGGCAGCTCGTAAAGATCATCTCTGTCTTAGAGAAGCTATCTTTTCT^AGT 
TCTTCACCTCAATATTAGTAGCATGGAGGAGACTGTCTTATACTTTTTCGTTGTTAAGAT
AGGATTGGAGTGTCACTTAAGCTTGGAGGAGCTAACTCTTGAAGTTCAGAAAAGCTTTGT 
GTCTGATGAAGTGATCGTCTCTACCAATTAAAAACAAAATTCTACATGTACTAGAGCGTG 
TATCGTTTTTTGGGATTAATAATCATATAATCGTTAC^^ 

AATAAGCTCCTCTAAACAAAACCTTCTTTTTAAAAAAACACACTTATGTTTTACTTAGCT 
TGTTGTTGTATCCGAAGTTGATCAACGTTGTAATTTCCCACAATAAATCATGACATTTTA 
TATGCTCT 

>G2519 Amino Acid Sequence (domain in AA coordinates : 1-65) 

MSHIAVERNRRRQMNEHLKSLRSLTPCFYIKRGDQASIIGGVTEFIKELQQLVQVLESKK

RRICTLNRPSFPYDHQTIEPSSLGAATTRVPFSRIENVMT^ 

ISGSNVVLRWSRRIVGQLVKIISVLEKLSFQVLHLNISS^ 

LEELTLEVQKSFVSDEVIVSTN*

>G2520 (133..1197)

AAGGAGTTTTGCATACTC^CC^GCCACAATC^TTTCTCTCTTCTCTATCTCTCTGGT^ 
TGAATCGGCGACGACTGAGTCAACTCGGTGTTGTTACTGGTTTCGTCGTATGTGTTGTAA 
CTGATTAAGTTGATGGATCCGAGTGGGATGATGAACGAAGGAGGACCGTTTAATCTAGCG 
GAGATCTGGCAGTTTCCGTTGAACGGAGTTTCAACCGCCGGAGATTCTTCTAGAAGAAGC 
TTCGTTGGACCGAATCAGTTCGGTGATGCTGATCTAACCAGAGCTGCTAACGGTGATCCA 
GCGCGTATGAGTCACGCGTTGTCTCAGGCGGTTATTGAAGGTATCTCCGGCGCTTGGAAA 
CGGAGGGAAGATGAGTCTAAGTCGGCGAAGATCGTCTCCACCATTGGCGCTAGTGAAGGT 
GAGAACAAAAGACAGAAGATAGATGAAGTGTGTGATGGGAAAGCAGAAGCAGAATCGCTA 
GGAACAGAGACGGAACAAAAGAAGCAACAGATGGAACCAACGAAAGATTATATTCATGTT 
CGAGCTAGAAGAGGTCAAGCTACTGATAGTCACAGTTTAGCTGAAAGAGCGAGAAGAGAG 
AAAATAAGTGAGCGGATGAAAATCTTGCAAGATCTTGTTCCGGGATGTAACAAGGTTATT 
GGAAAAGCACTTGTTCTAGATGAGATAATTAACTATATAGAATCATTGCAACGTCAAGTT
GAGTTCTTATCGATGAAGCTTGAAGGAGTCAACTCAAGAATGAACCCTGGTATCGAGGTT 
TTTCCACCCAAAGAGGTGATGATTCTCATGATCATCAACTCAATCTTCTCCATTTTTTTC 
AC^VAAACAATACATGTTTCTATCGAGGTATTCTCGGGGTAGGAGTCTCGATGTTTATGCG 
GTTCGGTCATTTAAGCATTGCAATAAACGGAGTGACCTCTGTTTTTGCTCCTGCTCCCCA 
AAAACAGAACTTAAGACAACTATATTTTCAC2WU 

CGAGTAGGAGTCGCTATTAGTTCATCTAAGCATTGCAATGAACCGTTTGGTCAGCAAGCG 
TTTGAGAATCCGGAGATACAGTTCGGGTCGCAGTCTACGAGGGAATACAGTAGAGGAGCA 
TCACCAGAGTGGTTGCACATGCAGATAGGATCAGGTGGTTTCGAAAGAACGTCTTGA 
>G2520 Amino Acid Sequence (domain in AA coordinates: 135-206) 
mPSGMMNEGGPFNLAEIWQFPLN^ 

HALSQAVIEGISGAWKRREDESKSAKIVSTIGASEGENKRQKIDEVCDGKAEAESLGTET 
EQKKQQMEPTKDYIHVRARRGQATDSHSLAERARREKISERMKILQDLVPGCNKVIGKAL 
VLDEIINYIQSLQRQVEFLSMKLEAVNSRMNPGIEVFPPKEVMILMIINSIFSIFFTKQY
MFLSRYSRGRSLDVYAVRSFKHCNKRSDLCFCSCSPKTELKTTIFSQNMTCFCRYSRVGV 
AISSSKHCNEPFGQQAFENPEIQFGSQSTREYSRGASPEWLHMQIGSGGFERTS*
>G2533 (1..1080) 

ATGATAAGCAAGGATCCAATATCGAGTTTACCTCCAGGGTTTCGATTTCATCCAACAGAT 
GAAGAACTCATTCTCCATTACCTAAGGAAGAAAGTTTCCTCTTCCCCAGTCCCGCTTTCG 
ATTATCGCCGATGTCGATATCTACAAATCCGATCCATGGGATTTACCAGCTAAGGCTCCA 
TTTGGGGAGAAAGAGTGGTATTTTTTCAGTCCGAGGGATAGGAAATATCCAAACGGAGCA 
AGACCAAACAGAGC7VGCTGCGTCTGGATATTGGAAAGCAACCGGAACZAGATAAATTGATT 
GCGGTACCAAATGGTGAAGGGTTTC^TGAAAACATTGGTATAAAAAAAGCTCTTGTGTTT 
TATAGAGGAAAGCCTCCAAAAGGTGTTAAAACCAATTGGATCATGCATGAATATCGTCTT 
GCCGATTCATTATCTCCCAAAAGAATTT^CTCTTCTAGGAGCGGTGGTAGCGAAGTTAAT 
AATAATTTTGGAGATAGGAATTCTAAAGAATATTCGATGAGACTGGATGATTGGGTTCTT 
TGCCGGATTTACAAGAAATCACACGCTTCATTGTCATCACCTGATGTTGCTTTGGTCACA 
AGC^UVTCAAGAGCATGAGGAAAATGACAACGAACCATTCGTAGACCGCGGAACCTTTTTG 
CGAAATTTGCAAAATGATCAACCCCTTAAACGCCAGAAGTCTTCTTGTTCGTTCTCAAAC 
TTACTAGACGCTACAGATTTGACGTTTCTCGCAAATTTTCTAAACGAAACCCCGGAAAAT 
CGTTCTGAATCAGATTTTTCTTTCATGATTGGCAATTTCTCTAATCCTGAC^TTTACGGA 
AACCATTACTTGGATCAGAAGTTACCGCAGTTGAGCTCTCCCACTTCAGAGACAAGCGGC 
ATCGGAAGCAAAAGAGAGAGAGTGGATTTTGCGGAAGAAACGATAAACGCTTCGAAGAAG 
ATGATGAACACATATAGTTACAATAATAGTATAGATCAAATGGATCATAGTATGATGCAA

CAACCTAGTTTCCTGAACCAGGAACTCATGATGAGTTCTCACCTTCAATATCAAGGCTAG 

>G2533 Amino Acid Sequence (conserved domain in AA coordinates : 11-186) 

MISKDPISSLPPGFRFHPTOEELILHYLRKK^ 

FGEKEWYFFSPRDRKYPNGARPNRAAASGYWKATGTD 

YRGKPPKGVICTNWIMHEYRI^SLSPKRINSSRSGGSEVNN^ 

CRIYKKSHASLSSPDVALVTSNQEHEENDNEPFVT)RGTFLPNLQNDQPLKRQKSSCSFSN 
LLDATDLTFLANFLNETPENRSESDFSFMIGNFSNPDIYGNHYLDQKLPQLSSPTSETSG 
IGSKRERVT>FAEETINASKKMMNTYS 
>G2534 (1..975) 

ATGGATAATATAATGCAATCGTCAATGCCACCGGGATTCCGATTTCATCCGACAGAGGAA 
GAGCTTGTGGGTTATTACCTAGATAGGAAGATCAATTCAATGAAGAGTGCTTTAGATGTC 
ATTGTAGAGATTGATCTCTACAAAATGGAGCCATGGGATATACAAGCGAGGTGTAAACTA 
GGGTATGAAGAGCAAAACGAGTGGTACTTCTTTAGTCATAAGGACAGGAAGTACCCTACC 
GGGACTAGGACCAACCGAGCCACTGCGGCTGGGTTCTGGAAAGCCACGGGTAGAGACAAG 
GCGGTACTATC^AAAAACAGTGTCATCGGAATGCGGAAGACACTTGTCTACTACTUVGGGT 
CGAGCTCCTAATGGAAGAAAGTCCGATTGGATCATGCACGAATACCGTCTCCAAAACTCC 
GAGCTTGCCCCGGTTCAGGAGGAAGGCTGGGTGGTGTGTCGAGCATTTAGGAAGCCAATT 
CCAAACCAGAGGCCATTAGGGTACGAGCCATGGCAGAACCAGCTCTACCACGTCGA7UVGT 
AGTAACAACTACTCATCTTCAGTGACAATGAACACGAGTCATCATATCGGTGCATCTTCA 
TCAAGTCATAACCTTAATCAAATGCTCATGAGCAATAACCACTACAATCCTAATAATACA 
TCCTCATCGATGCATCAATATGGCAACATTGAGCTCCCGC^GTTGGACAGCCCGAGCTTG 
TCGCCTAGTTTAGGGACGAATAAAGATCAGAACGAGAGTTTCGAGCAAGAAGAAGAGAAG 
AGCTTTAACTGTGTGGATTGGAGAACACTAGATACCTTGCTTGAGACACAAGTCATACAT 
CCGCATAACCCTAATATTCTTATGTTCGAAACGCAGTCGTATAATCCGGCGCCAAGCTTC 
CCTTCCATGCATCAAAGCTATAATGAGGTCGAAGCTAATATTCATCATTCTCTTGGATGC 
TTCCCTGACTCGTAA 

>G2534 Amino Acid Sequence (conserved domain in AA coordinates : 10-157) 

MDNIMQSSMPPGFRFHPTEEELVGYYLDRKINSMKSALDVIVEIDLYKMEPWDIQARCKL 

GYEEQNEWYFFSHKDRKYPTGTRTNRATAAGFWKATGRDKAV^ 

RAPNGRKSDWIMHEYRLQNSELAPVQEEGWWCRAFRKPIPNQRPLGYEPWQNQLYHVES 
SNNYSSSVTMNTSHHIGASSSSIINLNQI^MSNNHYNPNWT 

SPSLGTNKDQNESFEQEEEKSFNCVDWRTLDTLLETQVIHPHNPNILMFETQSYNPAPSF
PSMHQSYNEVEANIHHSLGCFPDS*
>G2573 (34.. 957) 

CCAGATTTAATTTGAGACTCTCAAAGAAACACCATGGAAGAAGAGCAACCTCCGGCCAAG 

AAACGAAACATGGGGAGATCTAGAAAAGGTTGCATGAAAGGTAAAGGCGGTCCAGAGAAC 

GCCACGTGTACTTTCCGTGGAGTTAGGCAACGGACTTGGGGTAAATGGGTGGCTGAGATC 

CGTGAGCCTAACCGTGGGACTCGTCTCTGGCTCGGCACGTTTAATACCTCGGTCGAGGCC 

GCCATGGCTTACGATGAAGCCGCTAAGAAACTCTATGGACACGAGGCTAAACTCAACTTG 

GTGCACCCACAACAACAACAACAAGTAGTAGTGAACAGAAACTTGTCTTTTTCTGGCCAC 

GGGTCGGGTTCTTGGGCTTATAATAAGAAGCTCGATATGGTTCATGGGTTGGACCTTGGT 

CTCGGCCAGGCAAGTTGTTCACGAGGTTCTTGCTCAGAGAGATCGAGTTTTCTACAAGAA 

GATGATGATCATAGTCATAATCGATGTTCGTCTTCAAGTGGTTCGAATCTTTGTTGGTTA 

TTACCTAAACAAAGTGATTCACAAGATCAAGAGACCGTTAATGCTACGACTAGTTATGGC 

GGTGAAGGCGGTGGTGGCTCTACGTTAACGTTTTCGACCAATTTGAAACCAAAGAATTTG 

ATGAGTCAGAATTATGGATTATACAATGGAGCTTGGTCTAGGTTTCTTGTGGGGCAAGAA 

AAGAAGACGGAACA5GACGTGTCATCGTCGTGTGGATCGTCGGACAACAAGGAGAGTATG 

TTGGTTCCTAGTTGCGGCGGAGAGAGGATGCATAGGCCGGAGTTGGAAGAGCGAACAGGA 

TATTTGGAAATGGATGATCTTTTGGAGATTGATGATTTAGGTTTGTTGATTGGCAAAAAT 

GGAGATTTCAAGAATTGGTGTTGTGAAGAGTTTCAACATCCATGGAATTGGTTCTGAGAG 

TTTTTATTTATTACTATTATTTATCATACATATTTCTTATATTTGACTTAGG 

>G2573 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEEEQPPAKKRNMGRSRKGCMKGKGGPENATCTFRGWQRTWGKWVAEIREPNRGTRLWL 

GTFNTSVEAAMAYDEAAKKLYGHEAKLNLVTIPQQQQQv^^ 

DMVHGLDLGLGQASCSRGSCSERSSFLQEDDDHSHNRCSSSSGSNLCWLLPKQSDSQDQE 
TVNATTSYGGEGGGGSTLTFSTNLKPKNLMSQNYGLYNGAWSRFLVGQEKKTEHDVSSSC 
GSSDNKESMLVPSCGGERMHRPELEERTGYLEMDDLLEIDDLGLLIGKNGDFKNWCCEEF
QHPWNWF* 

>G2589 (23.. 1354) 

AAAGAAAAGAAAAATAAAGATAATGAGGACGAAGACTAAGTTAGTACTCATACCTGATAG 
ACACTTTCGGAGAGCCACATTCAGGAAGAGGAATGCAGGGATAAGGAAGAAACTCCACGA 
GCTGACAACTCTCTGTGACATCAAAGCATGTGCGGTAATCTACAGTCCGTTCGAGAATCC 
AACGGTGTGGCCGTCAACCGAAGGTGTTCAAGAGGTGATTTCGGAGTTCATGGAGAAGCC 
GGCGACAGAACGGTCCAAGACGATGATGAGTCATGAGACTTTCTTGCGGGACCAAATCAC 
CAAAGAACAAAACAAACTAGAGAGTCTACGTCGTGAAAACCGAGAAACTCAGCTTAAGCA 
TTTTATGTTTGATTGCGTTGGAGGCAAGATGAGTGAGCAACAGTATGGTGCAAGGGACCT 
TCAAGATTTAAGTCTTTTTACTGATCAATATCTTAATCAGCTTAATGCCAGGAAGAAGTT 
CCTTACAGAATATGGTGAGTCTTCTTCTTCTGTTCCTCCTCTGTTTGATGTTGCGGGTGC
CAATCCTCCTGTTGTTGCAGATCAAGCTGCGGTAACTGTTCCTCCTTTGTTTGCTGTTGC 

TGTTGCGGGTGCCAATCTTCCTGTTGTTGCAGATCAAGCTGCGGTTAATGTTCCTACTGG 

ATTTCATAACATGAATGTGAACC^GAATCAGTATGAGCCGGTrCAGCCCTATGTCCCTAC 

TGGTTTTAGTGATCATATTCAATATCAGAATATGAACTTC^ 

GGTTCATTACCAGGCTCTTGCTGTTGCGGGTGCCGGTCTTCCTATGACTCAGAAT 

TGAGCCCGTTCACTACCAGAGTCTTGCTGTCGCGGGTGGCGGTCTTCCTATGAGTCAGTT 

GCAGTATGAGCCGGTTCAGCCTTATATCCCTACTGTTTTTAGTGATAATGTTCAATATCA 

GCATATGAATTTGTATCAAAATCAACAAGAGCCGGTTCACTACCAAGCTCTTGGTGTTGC 

AGGTGCCGGTCTTCCTATGAATCAGAATCAGTATGAGCCGGTTCAGCCCTATGTCCCTAC 

TGGTTTTAGTGATCATTTTCAGTTTGAGAATATGAATTTGAATCAAAATCAACAGGAGCC 

GGTTCAATACCAAGCTCCTGTTGATTTTAATCATCAGATTCAACAAGGAAACTATGATAT 

GAATTTGAACCAGAATATGAGTTTGGATCC^VATCAGTATCCGTTTCAAAATGATCCATT 

CATGAATATGTTGACAGAATATCCTTATGAATAAGCGGGTTATGTTGGAGAGCATGCAC 

>G2589 Amino Acid Sequence (domain in AA coordinates: TBD)

MRTKTKLVLIPDRHFRRATFRKRNAGIRKKLHELTTLCDIKACAVIYSPFEOT 

GVQEVISEFMEKPATERSKTMMSHETFLRDQITKEQNKLESLRRENRETQLKHFMFDCVG 

GKMSEQQYGARDLQDLSLFTDQYLNQLNARKKFLTEYGESSSSVPPLFDVAGANPPVVAD

QAAVTVPPLFAVAGANLPWADQAAVTVPPLFAVA^^ 

QNQYEPVQPYVPTGFSDHIQYQNMNFNQNQQEPVHYQALAVAGAGLPMTQNQYEPVHYQS 
LAVAGGGLPMSQLQYEPVQPYIPTVFSDNVQYQHMNLYQNQQEPVHYQALGVAGAGLPMN
QNQYEPVQPYVPTGFSDHFQFENMNLNQNQ 
LDPNQYPFQNDPFMNMLTEYPYE*
>G2687 (45.. 1139) 

CTCTGTCTCTCGTATCTTTCTACTACTCTGTTTCTTGAATTCTAATGAACAACATCGACG 

ACGCAAAGACGGAGACTTCAGTGTCTTCAGGTTCAAGCGACTCTTTCTTGCCTCTCAAGA 

AACGCATGAGACTTGATGACGAACCAGAAAACGCCCTAGTGGTTTCGTCTTCACCAAAGA 

CGGTTGTGGCTTCTGGCAATGTCAAGTACAAAGGAGTCGTTCAGCAACAGAACGGTCATT 

GGGGTGCCCAGATTTACGCAGACCACAAAAGGATTTGGCTTGGAACTTTCAAATCCGCTG 

ATGAAGCCGCCACGGCTTACGATAGTGCATCTATCAAACTCCGAAGCTTTGACGCTAACT 

CGCACCGGAACTTCCCTTGGTCTACAATCACTCTCAACGAACCAGACTTTCAAAATTGCT 

ACACAACAGAGACTGTGTTGAACATGATCAGAGACGGTTCGTACCAACACAAATTCAGAG 

ATTTTCTCAGAATCAGATCTCAGATTGTTGCGAGTATCAACATCGGGGGACCAAAACAAG 

CCCGAGGAGAAGTGAATCAAGAATCAGACAAGTGTTTTTCTTGCACACAGCTTTTTCAGA 

AGGAATTGACACCGAGCGATGTAGGGAAACTAAATAGGCTTGTGATACCTAAAAAGTATG 

CAGTGAAGTATATGGCTTTCATAAGCGCTGATCAAAGCGAGAAAGAAGAGGGTGAAATAG 

TAGGATCTGTGGAAGATGTGGAGGTTGTGTTTTACGACAGAGCAATGAGACAATGGAAGT 

TTAGGTATTGTTACTGGAAAAGTAGCCAGAGCTTTGTCTTCACCAGAGGATGGAATAGTT 

TCGTGAAGGAGAAGAATCTCAAGGAGAAGGATGTTATTGCCTTCTACACTTGCGATGTCC 

CGAACAATGTGAAGACATTAGAAGGTCAAAGAAAGAACTTCTTGATGATCGATGTTCATT 

GCTTTTCAGACAACGGTTCCGTGGTAGCTGAGGAAGTAAGTATGACGGTTCATGACAGTT 

CAGTGCAAGTAAAGAAAACAGAAAACTTGGTTAGCTCCATGTTAGAAGATAAAGAAACCA 

AATCAGAGGAGAACAAAGGAGGGTTTATGCTGTTTGGTGTAAGGATCGAATGTCCTTAGG

GAATTTTTCTTTAAAAGTTTCTTACTTCAACTAGAACTTGTTTTACTTGTACCT 

>G2687 Amino Acid Sequence (domain in AA coordinates: TBD) 
Mll^IDDAKTETSVSSGSSDSFLPLKKRMRLDDEPENALWSSSPKTWASGNVKYKGVVQ

QQNGHWGAQIYADHKRIWLGTFKSADEAATAYDSASIKLRSFDANSHRNFPWSTITLNEP

DFQNCYTTETVLNMIRDGSYQHKFRDFLRIRSQIVASINIGGPKQARGEVNQESDKCFSC 

TQLFQKELTPSDVGKLNRLVIPKKYAVKYMPFISADQSEKEEGEIVGSVEDVEVVFYDRA 

MRQWKFRYCYWKSSQSFVFTRGWNSFVKEKNLKEKDVIAFYTCD^ 

MIDVHCFSDNGSWAEEVSMTVHDSSVQVKKTENLVSSMLEDKETKSEENKGGFMLFGVR 

IECP* 

>G27 (83.. 622) 

CAAAATACCAAAAACAAAACATTTTTTTTAATC 

CGTTACATTAAATTATCTTTAGATGCAAGACTCTTCCTCTCACGAATCGCAACGTAACCT 

CCGGTCACCGGTGCCGGAGAAAACCGGAAAGAGTTCTAAGACTAAAAATGAGCAAAAAGG 

TGTTTCTAAACAACCAAATTTTCGTGGGGTCAGAATGAGACAATGGGGAAAATGGGTGTC 

TGAAATTAGAGAACCAAGAAAGAAATCAAGAATATGGCTCGGTACTTTCTCTACGCCGGA 

GATGGCGGCGCGTGCACACGACGTGGCGGCTTTAGCCATCAAAGGTGGCTCT 

TAATTTCCCGGAGCTAGCTTACCZATTTGCCGAGACCGGCTAGCGCGGACCCTAAAGACAT 

TCAAGAAGCCGCCGCCGCAGCAGCTGCCGTTGACTGGAAAGCACCGGAGTCTCCGTCTAG 

CACCGTGACGTCATCTCC^GTCGCCGACGACGCTTTCTCCGATCTTCCTGATCTTTTGCT 

TGACGTCAATGATCACAACAAAAACGATGGATTCTGGGACTCGTTTCCGTACGAAGATCC 

TTTCTTCTTGGAAAATTACTAGAAGGCAAATTCTTGCCGGCGAACGGATTTTCCGGTGGT 

TTCCCGGTAAATAAGAAGACGATGTCGTTTTGTACCTTTTTTGTCTACGATGGGAAATTT 

CTTTTTTTTTTACGTGTGAGTAAAAGTTTCCGAATGTGTC 

TATTTAATTTCTTTTTTTTGTAC^^ 

GTGCTTTTATCTTCCAAATTCATTAAAAAAAAAAAAAAAAA 

>G27 Amino Acid Sequence (domain in AA coordinates: 37-104) 
MQDSSSHESQRNLRSPVPEKTGKSSKTKNEQKGVSKQPNFRGVRMRQWGKWVSEIREPRK
KSRIWLGTFSTPEMAARAHDVAALAIKGGSAHIiNFPEJ^AYHLPRPASADPKDIQEAAT^AA 
AAVDWKAPESPSSTVTSSPVADDAFSDLPDLLLDVOTHNK1IDGFWDSFPYE 
>G2720 (1..894) 

ATGGAAGCGAAGAAGGAAGAGATAAAGAAAGGTCCATGGAAAGCCGAAGAAGACGAAGTA 

CTCATCAACCATGTCAAGAGATACGGTCCTCGTGATTGGAGCTCCATTCGATCCAAAGGT 

CTTCTTCAACGCACCGGCAAATCCTGTCGTCTTCGTTGGGTCAATAAACTCCGTCCCAAT 

CTCAAAAATGGATGCAAGTTCTCGGCTGACGAAGAGAGGACTGTGATTGAGTTACAATCT 

GAGTTTGGTAACAAATGGGCGAGAATCGCTACGTATCTACCGGGAAGAACTGATAACGAT 

GTGAAGAATTTCTGGAGTAGCAGACAAAAGAGACTCGCTAGGATTCTTCATAACTCCTCT 

GATGCATCGAGTTCGAGTTTCAATCCCAAATCTTCTTCTTCTCATCGACTCAAGGGCAAA 

AACGTCAAACCAATCCGTCAATCCTCTCAGGGTTTTGGTTTGGTTGAGGAAGAGGTTACA 

GTTTCTTCTTCATGTTCCCAGATGGTTCCTTATTCATCTGATCAAGTTGGTGATGAAGTC 

TTGAGGTTGCCGGATTTGGGTGTTAAGTTAGAGCATCAGCCTTTCGCTTTTGGCACTGAT 

CTTGTCCTAGCAGAGTACTCTGACTCACAGAATGATGCAAATCAGCAAGCAATCAGCCCT 

TTCTCTCC^GAAAGCAGAGAGCTTTTGGCTAGACTTGACGACCCTTTTTACTATGATATA 

CTTGGACCAGCTGATTGTTCTGAGCCATTGTTCGCTCTCCCTCAGCCGTTCTTCGAGCCT 

TCGCCTGTGCCGAGAAGATGCAGACATGTTTCAAAGGATGAAGAAGCTGATGTTTTCTTA 

GACGATTTCCCAGCTGACATGTTTGATCAGGTTGATCCAATCCCAAGTCCTTAG 

>G2720 Amino Acid Sequence (domain in AA coordinates: 10-114) 

meakkeeikkgpwkaeedevlinhvkrygprdwssirskgllq 

lkngckfsadeertvielqsefgnkwariatylpgrtdndvknfwssrqkrlaril™ 

dassssfnpksssshrlkgknvkpirqssqgfglveeevtvssscsqmvpyssdqvgdev 

lrlpdlgvkxehqpfafgtdlvlaeysdsqndanqqaispfspesrellarlddpfyydi 

lgpadsseplfalpqpffepspvprrcrhvskdeeadvflddfpadmfdqvdpipsp* 

>G2787 (142.. 1584) 

TCTGAGAGCAAAAAACAAAAAAAAAGAAAAAAAAACCCTAAATCTAAATCTCACCTTCCA 
CCTCTGTCTTTTTTTTTTTTGTTCTTTTTTTTTTTTTTACTGTATCTTCTCTTCTCTTTG 
CTCTGCAAAAATCTCACATCCATGGATCCATCTCTTGGTGATCCTCATCATCCTCCTCAG 
TTCACCCCTTTTCCTCATTTTCCCACCTCCAATCATCATCCTTTAGGACCAAATCCGTAC 
AATAACCATGTCGTCTTCCAACCGCAGCCGCAAACGCAAACGCAAATCCCGCAACCGCAG 
ATGTTTCAGTTATCTCCACATGTTTCAATGCCCCACCCTCCTTACTCCGAAATGATTTGC 
GCTGCGATTGCGGCGTTAAACGAACCGGATGGTTCGAGCAAGATGGCAATTTCGAGATAC 
ATCGAGAGATGTTACACCGGTTTAACTTCTC

AAGACTTTGAAGACCAGTGGTGTTCTTTCTATGGTTAAGAAATCTTACAAAATTGCTGGT 
TCITCTACTCCTCCTGCTAGTGTAGCTGTTGCTGCTGCTGCCGCCGCTCAAGGTCTCGAT 
GTTCCCAGATCTGAGATTCTCCATTC7\AGTAACAACGATCCCATGGCTTCTGGCTCTGCT 
TCTCAGCCTCTGAAACGAGGTCGTGGTCGTCCTCCTAAGCCTAAACCTGAATCTCAACCA 
CAACCACTACAGCAACTTCCACCGACCAATCAAGTCCAGGCTAACGGACAGCCAATCTGG 
GAACAGCAGCAAGTTCAATCACCTGTTCCGGTTCCGACTCCGGTTACAGAGTCGGCGAAG 
AGAGGACCTGGTCGTCCAAGGAAGAACGGTTCTGCTGCTCCTGCTACTGCACCAATCGTT 
CAAGCTTCGGTTATGGCTGGAATTATGAAACGTAGAGGTAGACCACCGGGTCGTCGAGCT 
GCTGGGAGACAGAGGAAGCCCAAATCCGTTTCTTCTACTGCCTCTGTGTATCCTTATGTT 
GCTAATGGTGCTAGACGCAGAGGAAGGCCTAGGAGAGTTGTTGACCCTAGCAGTATTGTT 
AGTGTTGCTCCAGTAGGTGGTGAAAATGTGGCAGCGGTTGCGCCAGGGATGAAGCGTGGA 
CGTGGACGACCACCTAAGATTGGTGGTGTTATCAGTAGGCTTATTATGAAGCCTAAGAGA 
GGACGAGGACGTCCTGTAGGTAGACCCAGAAAGATTGGAACATCAGTCACGACTGGGACA 
CAAGATTCTGGAGAACTCAAGAAGAAGTTTGATATTTT^ 

GTGAAGGTGTTGAAGGATGGAGTTACAAGTGAGAATCAAGCAGTGGTGCAAGCCATAAAA 
GATCTGGAAGCACTAACAGTGACGGAGACCGTTGAGCCACAAGTTATGGAAGAAGTGCAG 
CCAGAGGAGACTGCAGCACCACAGACTGAAGCTCAACAAACTGAAGCTGCTGAGACACAA 
GGAGGACAAGAAGAAGGACAAGAAAGAGAAGGAGAAACACAGACCCAGACAGAAGCAGAG 
GCAATGCAAGAAGCTCTGTTCTGAAGAATAATAATGATCTAGAAAACAACCTAGACATAA 
TAGCCTTGGTGTTTGGCGTTAGGAGTGTTTTTTTTTAGTTGTTTTAGGTGTTGGAATCGC 
ATCTTAAATTATATAAAAATCTATAAGGAATTTTAATTTTTCTAGGTTTTGTTGTCTGCA 
GAAGAAGAAATAGTAGACTCGTTAATGGTGTTGTTGTCGGTGTGTCTTTAACCAAACCAT 
AAGACGTGGCTGTAAATTAGCGATGTTTCTAGTCTTCCATCTTTAATAATCTCTTATTGC 
GTCTGTGCCTTTGTTTTT 

>G2787 Amino Acid Sequence (domain in AA coordinates: 172-192, 226-247, 256-276, 290-311, 245-366)

^PSLGDPHHPPQFTPFPHFPTSNHHPLGPNPYWNHVVPQPQPQTQTQIPQPQMFQLSPH 

VSMPHPPYSEMICAAIAALNEPDGSSKI^ISRYI^ 

VLSMVKKSYKIAGSSTPPASVAVAAAAAAQGIJDVPRSEII^ 

RGRPPKPKPESQPQPLQQLPPTNQVQANGQPIWEQQQVQSPVPVPTPVTESAKRGPGRPR 

KNGSAAPATAPIVQASVMAGIMKRRGRPPGRRAAGRQRKPKSVSSTASVYPYVANGARRR 

GRPRRVVDPSSIVSVAPVGGENVAAVAPGMKRGRGRPPKIGGVISRLIMKPKRGRGRPVG

RPRKIGTSVTTGTQDSGELKKKFDIFQEKVKEIVKVLKDGVTSENQAVVQAIKDLEALTV

TETVEPQVMEEVQPEETAAPQTEAQQTEAAETQGGQEEGQEREGETQTQTEAEAMQEALF 
* 

>G2789 (82..879)

CTTTAGGGACACCAAATCTATTCAACCTAAAAGCCTTCTTTTCCCCTATATTGACCAACT 

TTTTAGCGAATCAGAAGAGGAATGGATGAGGTATCTCGTTCTCATACACCGCAATTTCTA 

TCAAGTGATCATCAGCACTATCACCATCAAAACGCTGGACGACAAAAACGCGGCAGAGAA 

GAAGAAGGAGTTGAACCCAACAATATAGGGGAAGACCTAGCCACCTTTCCTTCCGGAGAA

GAGAATATCAAGAAGAGAAGGCCACGTGGCAGACCTGCTGGTTCCAAGAACAAACCCAAA 

GCACCAATCATAGTCACTCGCGACTCCGCGAACGCCTTCAGATGTCACGTCATGGAGATA 

ACCAACGCCTGCGATGTAATGGAAAGCCTAGCCGTCTTCGCTAGACGCCGTCAGCGTGGC 

GTTTGCGTCTTGACCGGAAACGGGGCCGTTACAAACGTCACCGTTAGACAACCTGGGGGA 

GGCGTCGTCAGTTTACACGGACGGTTTGAGATTCTTTCTCTCTCGGGTTCGTTTCTTCCT 

CCACCGGCACCACCAGCTGCGTCTGGTTTAAAGGTTTACTTAGCCGGTGGTCAAGGTCAA 

GTGATCGGAGGCAGTCTGGTGGGACCGCTTACGGCATCAAGTCCGGTGGTCGTTATGGCA 

GCTTCATTTGGAAACGCATCTTACGAGAGGCTGCCACTAGAGGAGGAGGAGGAAACTGAA 

AGAGAAATAGATGGAAACGCGGCTAGGGCGATTGGAACGCAAACGCAGAAACAGTTAATG 

CAAGATGCGACATCGTTTATTGGGTCGCCGTCGAATTTAATTAACTCTGTTTCGTTGCCA 

GGTGAAGCTTATTGGGGAACGCAACGACCGTCTTTCTAAGATAATATCATTGATAATATA 

AGTTTCGTCTTCTTATTCTTTTTCACTTTTTACCTTTTTCACTTTCTTAGGTTTTGTTTT 

AACGTTTGATTAATACCTGAAGGTTTTTGGAAAATTTTCGATCGGATAAAAGGATTTATG 

TTGCGAGCCGAAACGCGGCC 

>G2789 Amino Acid Sequence (domain in AA coordinates: 53-73, 121-165) 
MDEVSRSHTPQFLSSDHQHYHHQNAGRQKRGREEEGVEPNNIGEDLATFPSGEENIKKRR
PRGRPAGSKNKPKAPIIVTRDSANAFRCHVMEITNACDVMESLAVFARRRQRGVCVLTGN
GAVTNVTVRQPGGGVVSLHGRFEILSLSGSFLPPPAPPAASGLKVYLAGGQGQVIGGSLV
GPLTASSPVVVMAASFGNASYERLPLEEEEETEREIDGNAARAIGTQTQKQLMQDATSFI
GSPSNLINSVSLPGEAYWGTQRPSF*
>G31 (13..615)

CnTTTATAAGCAATGGCTCCAAGACAGGCGAACGGTAGAAGCATTGCCGTGAGTGAAGGC
GGCGGAGGGAAGACGATGACGATGACGACGATGCGGAAGGAAGTGCACTTTAGAGGTGTG 
AGGAAGCGTCCATGGGGTAGATACGCGGCGGAGATCCGTGACCCGGGAAAGAAAACCCGG 
GTTTGGCTCGGGACATTCGACACGGCGGAGGAAGCTGCAAGAGCTTACGACACCGCCGCT 
AGAGAGTTTCGTGGCTCCAAAGCAAAGACTAATTTCCCTCTTCCCGGAGAGTCTACTACG 
GTTAACGACGGTGGCGAGAACGATTCTTACGTCAACCGTACGACGGTGACGACGGCGCGT 
GAGATGACGCGTCAGAGATTTCCGTTTGCATGTCACCGGGAGCGTAAAGTCGTCGGTGGT
TATGCTTCTGCTGGTTTTTTCTTCGATCCGTCAAGAGCTGCTTCGTTAAGAGCAGAGCTT 
TCTCGGGTTTGTCCGGTTCGGTTTGATCCGGTTAATATCGAGTTGAGTATTGGTATTCGA 
GAAACCGTAAAAGTTGAACCGAGAAGAGAACTAAACCTGGATCTTAACCTAGCTCCACCG
GTGGTGGACGTTTAGATTTTTTTCTTCTTTT(^TAATTTGTATTTTACATTGCCGGAAAA 
TAATTAATGTTTTCTTTAG 

>G31 Amino Acid Sequence (domain in AA coordinates: TBD) 
MAPRQANGRSIAVSEGGGGKTMTMTTMRKEVHFRGVRKRPWGRYAAEIRDPGKKTRVWLG
TFDTAEEAARAYDTAAREFRGSKAKTNFPLPGESTTVNDGGENDSYVNRTTVTTAREMTR
QRFPFACHRERKVVGGYASAGFFFDPSRAASLRAELSRVCPVRFDPVNIELSIGIRETVK
VEPRRELNLDLNLAPPVVDV*
>G33 (20..757)

ATTCTCCCCCAACCAAAATATGACCACAGAAAAAGAGAATGTCACTACGGCCGTGGCCGT 

GAAAGACGGCGGAGAAAAGAGTAAGGAAGTGAGTGACAAGGGCGTAAAGAAGAGAAAGAA 

TGTAACTAAGGCCCTGGCCGTGAATGACGGCGGAGAAAAGAGTAAGGAAGTGCGTTACAG 

GGGTGTAAGGAGGAGACCATGGGGGAGATATGCTGCGGAGATCCGTGATCCGGTAAAGAA 

AAAACGGGTCTGGCTCGGGTCCTTCAACACGGGGGAGGAAGCCGCCAGAGCCTACGACTC 

CGCTGCCATAAGGTTTCGAGGATCGAAAGCTACTACTAACTTCCCTCTAATCGGATACTA

TGGGATTTCTTCGGCGACGCCGGTGAACAACAACCTTTCCGAGACGGTGAGTGATGGAAA 

TGCCAACCTCCCTCTCGTTGGAGACGATGGGAATGCTTTGGCTTCTCCGGTGAACAACAC 

CCTTTCCGAAACGGCGCGTGATGGAACACTTCCATCGGATTGTCACGACATGTTATCTCC 

GGGGGTGGCTGAAGCGGTTGCTGGATTTTTCTTAGATCTGCCTGAAGTTATTGCGTTGAA 

AGAGGAGCTTGATCGAGTTTGTCCTGACCAGTTTGAGTCCATTGATATGGGGTTGACTAT 

TGGTCCTCAAACCGCCGTGGAAGAGCCTGAGACTTCCTCCGCCGTGGATTGTAAGCTGCG 

AATGGAACCGGATCTTGACCTCAACGCAAGTCCCTAAAGATTGATCTGATGTTGTTGTCC 

TTGAATAAGTTTGTTATCTTGTCGCTCTTCTGATTGTCTGTACTTCTATTGGTTGAT^ 

TGCTTTTGGAGGACAAAACAAACATTTTTTTA 

ATCGAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G33 Amino Acid Sequence (domain in AA coordinates: 50-117) 
MTTEKENVTTAVAVKDGGEKSKEVSDKGVKKRKNVTKALAVNDGGEKSKEVRYRGVRRRP
WGRYAAEIRDPVKKKRVWLGSFNTGEEAARAYDSAAIRFRGSKATTNFPLIGYYGISSAT
PVNNNLSETVSDGNANLPLVGDDGNALASPVNNTLSETARDGTLPSDCHDMLSPGVAEAV
AGFFLDLPEVIALKEELDRVCPDQFESIDMGLTIGPQTAVEEPETSSAVDCKLRMEPDLD
LNASP*

>G342 (1..723)

ATGGACGTCTACGGCATGTCTTGACCGGACTTGCTTCGTATCGACGACCTTCTCGATTTC 
TCCAACGACGAAATGTTCTCTTCCTCTTCCACCGTCACTTCCTCCGCCGCTTCCTCCGCC 
GCTTCTTCCGAAAACCCTTTCAGCTTTCCTTCTTCCACCTACACTTCTCCTACTCTCCTC 
ACCGACTTCACTCACGATCTCTGCGTTCCCAGTGACGACGCAGCTCATCTCGAATGGTTA 
TCGCGATTCGTTGACGATTCATTCTCCGATTTCCCAGCAAATCCTTTAACCATGACCGTT 
AGACCCGAGATTTCATTCACCGGAAAACCTAGAAGTCGCCGATCAAGAGCACCAGCACCT 
TCCGTAGCTGGAACTTGGGCTCCGATGTCTGAATCAGAGCTTTGTCACTCCGTCGCTAAA 
CCTAAACCGAAGAAAGTCTACAACGCTGAATCGGTTACGGCGGATGGAGCGAGGCGGTGC 
ACGC^CTGTGCCTCGGAGAAAACGCC^CAGTG 

CTTTGTAACGCTTGTGGAGTTCGTTACAAATCAGGGAGGCTTGTACCGGAATACAGACCG 
GCGTCGAGTCCGACGTTTGTATTGACTCAGCATTCGAACTCTCATCGGAAAGTTATGGAG
CTCCGGCGACAGAAGGAACAACAAGAATCTTGCGTTCGAATTCCGCCGTTTCAGCCGCAG
TAA 

>G342 Amino Acid Sequence (domain in AA coordinates: 155-190) 
MDVYGMSSPDLLRIDDLLDFSNDEIFSSSSTVTSSAASSAASSENPFSFPSSTYTSPTLL 
TDFTHDLCVPSDDAAHLEWLSRFVDDSFSDFPANPLTMTVRPEISFTGKPRSRRSRAPAP 
SVAGTWAPMSESELCHSVAKPKPKKVYNAESVTADGARRCTHCASEKTPQWR

LCNACGVRYKSGRLVPEYRPASSPTFVLTQHSNSHRKVMELRRQKEQQESCVRIPPFQPQ 
* 

>G352 (80..817)

AATACACCA(^CACTTCACTCTTTCTTCATCTTCTTCTTCTTAAATAGCTCGAAATCACA 
TCTCACAGAATTAAATCTTATGGCTCTCGAGACTCTCAATTCTCCAACAGCTACCACCAC 
CGCTCGGCCTCTTCTCCGGTATCGTGAAGAAATGGAGCCTGAGAATCTCGAGCAATGGGC 
TAAAAGAAAACGAACAAAACGTCAACGTTTTGATCACGGTCATCAGAATCAAGAAACGAA 
CAAGAACCTTCCTTCTGAAGAAGAGTATCTCGCT^ 

CTCCGCCGTACAATCTCCTCCTCTTCCTCCTCTACCGTCACGTGCGTCACCGTCCGATCA 
CCGAGATTACAAGTGTACGGTCTGTGGGAAGTCCTTTTCGTCATACCAAGCCTTAGGTGG 
ACACAAGACGAGTCACCGGAAACCGACGAACACTAGTATCACTTCCGGTAACCAAGAACT 
GTCTAATAACAGTCACAGTAACAGCGGTTCCGTTGTTATTAACGTTACCGTGAACACTGG 
TAACGGTGTTAGTCAAAGCGGAAAGATT(^CACTTGCTCAATCTGTTTCAAGTCGTTTG^ 
GTOTGGTCAAGCCTTAGGTGGACACAAACGGTGTCACTATGACGGTGGCAACAACGGTAA 
CGGTAACGGAAGTAGCAGCAACAGCGTAGAACTCGTCGCTGGTAGTGACGTCAGCGATGT 
TGATAATGAGAGATGGTCCGAAGAAAGTGCGATCGGTGGCCACCGTGGATTTGACCTAAA 
CTTACCGGCTGATCAAGTCTCAGTGACGACTTCTTAA 

>G352 Amino Acid Sequence (domain in AA coordinates: 99-119,166-186) 

MALETLNSPTATTTARPLLRYREEMEPENLEQWAKRKRTKRQRFDHGHQNQETNKNLPSE

EEYIALCLLMLARGSAVQSPPLPPLPSRASPSDHRDYKCTVCGKSFSSYQALGGHKTSHR 

KPTNTSITSGNQELSNNSHSNSGSVVINVTVNTGNGVSQSGKIHTCSICFKSFASGQALG

GHKRCHYDGGNNGNGNGSSSNSVELVAGSDVSDVDNERWSEESAIGGHRGFDLNLPADQV 

SVTTS* 

>G357 (1..615) 

ATGCAGAACAAACACAAATGCAAGCTCTGTTCCAAGAGTTTCTGTAATGGCAGAGCACTT 
GGTGGTCACATGAAGTCTCACTTGGTCTCATCTCAGTCTTCAGCTCGGAAGAAACTAGGT 
GACTCGGTCTATTCTTCTTCTTCCTCTTCCTCCGATGGTAAAGCGCTCGCCTACGGGTTA 
CGAGAGAACCCGAGGAAGAGTTTCCGGGTCTTTAATCCGGATCCTGAGTCATCCACAATT 
TACAACAGTGAGACAGAGACCGAACCTGAATCCGGAGACCCGGTTAAGAAACGGGTCAGA 
GGAGATGTTTCAAAGAAGAAGAAGAAGAAGGCAAAGAGTAAGAGAGTGTTTGAGAACTCG 
AAGAAGCAAAAGACAATTCACGAGTCACCAGAACCAGCGAGTTCTGTCTCTGATGGTTCT 
CCTGAACAAGATTTAGCTATGTGCTTGATGATGCTGTCAAGAGATTCAAGGGAGCTCGAG 
ATTAAACTGAAAAAACCGGAGGAAGAGAGGAAGCCGGAAAAAAGACATTTCCCTGAGCTC 
CGTCGCTGTATGATAGATCTGAATCTTCCTCCGCCGCAAGAAGCTGAAGCTGTCACCGTC 
GTTTCAGCCATATAA

>G357 Amino Acid Sequence (domain in AA coordinates: 7-29) 

MQNKHKCKLCSKSFCNGRALGGHMKSHLVSSQSSARKKLGDSVYSSSSSSSDGKALAYGL 

RENPRKSFRVFNPDPESSTIYNSETETEPESGDPVKKRVRGDVSKKKKKKAKSKRVFENS

KKQKTIHESPEPASSVSDGSPEQDLAMCLMMLSRDSRELEIKLKKPEEERKPEKRHFPEL 

RRCMIDLNLPPPQEAEAVTVVSAI*

>G358 (1..855) 

ATGGGTCAAGATGAGGTTGGGAGTGATCAGACGCAAATCATAAAAGGGAAACGTACGAAG 
CGACAAAGATCGTCTTCGACGTTTGTGGTGACGGCGGCGACAACAGTGACTTCAACAAGT 
TCATCGGCCGGTGGAAGTGGAGGAGAAAGAGCTGTTTCAGATGAATACAACTCGGCGGTT 
TCGTCTCCGGTGACTACTGATTGTACGCAAGAAGAAGAAGACATGGCGATTTGTCTCATC 
ATGTTAGCTCGTGGGACAGTTCTTCCATCGCCGGATCTCAAGAACTCGAGAAAAATTCAT 
CAGAAGATTTCGTCGGAGAATTCTAGTTTCTATGTGTACGAGTGTAAAACGTGTAACCGG 
ACGTTTTCGTCGTTCCAAGCACTTGGTGGACACAGAGCGAGCCACAAGAAGCCGAGGACG 
TCGACTGAGGAAAAGACTAGACTACCCCTGACGCAACCCAAGTCTAGTGCATCAGAAGAA 
GGGCAAAACAGTCATTTCAAAGTTTCCGGCTCAGCCCTAGCTTCACAGGCAAGTAACATC 
ATCAACAAGGCAAACAAAGTACACGAGTGTTCCATCTGCGGTTCTGAGTTCACTTCCGGG
CAAGCTCTCGGTGGTCACATGAGGCGGCACAGGACAGCCGTAACCACGATTAGCCCCGTT
GCAGCCACCGCAGAAGTAAGCAGAAACAGTACAGAGGAAGAGATTGAGATCAATATAGGC 
CGTTCGATGGAACAGCAGAGGAAATATCTACCGTTGGATCTTAATCTACCAGCACCAGAA 
GATGATCTAAGAGAGTCAAAGTTTCAAGGGATAGTATTCTCAGCAACACCAGCGTTAATA 
GATTGTCATTACTAG 

>G358 Amino Acid Sequence (domain in AA coordinates: 124-135, 188-210) 

MGQDEVGSDQTQIIKGKRTKRQRSSSTFVVTAATTVTSTSSSAGGSGGERAVSDEYNSAV

SSPVTTDCTQEEEDMAICLIMLARGTVLPSPDLKNSRKIHQKISSENSSFYVYECKTCNR

TFSSFQALGGHRASHKKPRTSTEEKTRLPLTQPKSSASEEGQNSHFKVSGSALASQASNI 
INKANKVHECSICGSEFTSGQALGGHMRRHRTAVTTISPVAATAEVSRNSTEEEIEINIG 
RSMEQQRKYLPLDLNLPAPEDDLRESKFQGIVFSATPALIDCHY* 
>G360 (1..543) 

ATGTGGAAC CC TAACAAAATTGAAGAATTGGAGGATGATC 
GCCTTTGAGC^GACACTAAAGGCAACATCTCTGGTACCACTTG 

ACTTGCAATTTCTGCCGCCGTGAGTTCCGTTCTGCTCAAGCCTTAGGCGGTCACATGAAT 

GTCCACCGCCGTGACCGCGCCTCATCTAGGGCTCATCAAGGTTCCACCGTTGCGGCTGCG 

GCTAGAAGCGGCCACGGGGGGATGTTACTCAATOCTTGTGCTCCGCCGTTGCCTACAACG 

ACACTTATAATACAATC(^CGGCGAGTAACATTGAAGGTTTGTCCCATTTCTACCAACTG 

CAAAACCCTAGTGGCATTTTTGGTAATTCTGGTGACATGGTGAATCTTTATGTAGAAGTT 

CCTCCTCGGCTTATTGAATATTCGACAGGAGATGATGAGAGCATTGGCTCGATG7^AAGAA 

GCGACAGGAACATCAGTGGATGAGCTITOATCTTGAACTTCGGCTAGGGCACCATCCACC 

TGA 

>G360 Amino Acid Sequence (domain in aa coordinates: 42-62) 
MWNPNKI EELEDDDES V^VKAFEQDTKGNI SGTTWPPRS YTCNFCRREFRSAQALGGHMN 
VHRRDRASSRAHQGSTVAAAARSGHGGMLLNSC^PLPTTTLIIQSTASNIEGLSHFYQL 
QNPSGIFGNSGDMVNLYVEVPPRLIEYSTGDDESIGSMKEATGTSVDELDLELRLGHHPP 
* 

>G362 (195..830)

ATAAAAAACCCTTCATACAATATAAAATTTCTTTAGACATACAATATATTATACTATTAC 
AGATGCAATGCATCATTAGTTACAAACTATTAAACTAAATATCCCCCGTCTCTCTCTTGC 
TATATAAAGAAGATCATTTACACATCTCCTTAAGCAAATTAAACCCATCGATAAACACAT
ACGTTCACACATATATGTCTATAAATCCGACAATGTCTCGTACTGGCGAAAGTTCTTCAG 
GTTCGTCCTCCGACAAGACGATAAAGCTATTCGGCTTCGAACTCATCAGCGGCAGTCGTA 
CGCCGGAAATCACGACGGCGGAAAGCGTGAGCTCGTCCACAAACACGACGTCGTTAACAG 
TGATGAAAAGACACGAGTGCCAATACTGCGGTAAAGAGTTTGCAAATTCTCAAGCCTTAG 
GAGGTCACCAAAACGCTCAGAAGAAGGAGAGGTTGAAGAAGAAGAGGCTTCAGCTTCAAG 
CTCGGCGAGCCAGCATCGGCTATTATCTCACCAACCACCAACAACCCATAACGACGTCAT 
TTCAGAGACAATACAAAACGCCGTCGTATTGTGCATTCTCCTCCATGCACGTGAATAATG 
ATCAGATGGGTGTGTACAACGAAGATTGGTCGTCGAGGTCGTCGCAGATTAACTTCGGTA 
ATAATGACACGTGCCAAGATCTTAATGAAC7U\AGCGGTGAGATGGGTAAGCTGTACGGTG 
TTCGACCGAACATGATTCAGTTCCAGAGAGATCTGAGTTCTCGTTCTGATCAGATGAGAA 
GTATTAACTCGCTGGATCTTCATCTAGGTTTTGCCGGAGATGCGGCATAACAAATTAAAG 
AGAGATATATGATTAAGATTATATGTACTATAGTGGCGTATTTCATTGGGATCATGAAGG 
GGAAAAAACGAGACATATAGTATTCTTGATGC^TTTGAGTTTTGTAATTTATTTAGGTO 
TATGTATGTTTTCGAAG 

>G362 Amino Acid Sequence (domain in AA coordinates: 62-82) 

MSINPTMSRTGESSSGSSSDKTIKLFGFELISGSRTPEITTAESVSSSTNTTSLTVMKRH 

ECQYCGKEFANSQALGGHQNAHKKERLKKKRLQLQARRASIGYYLTNHQQPITTSFQRQY 

KTPSYCAFSSMHVNNDQMGVYNEDWSSRSSQINFGNNDTCQDLNEQSGEMGKLYGVRPNM

IQFQRDLSSRSDQMRSINSLDLHLGFAGDAA*

>G364 (64..516)

AAGCTTGATATCGCCTCTCTCTAATCTCTCTTTCTCTCTCTATCTCTAAGAATATATAAA 
GGTATGGACTACCAGCCAAACACATCCCTACGTCTAAGCCTACCAAGTTACAAGAACCAC
CAACTAAACCTAGAACTTGTTCTCGAGCCTTCTTCCATGTCTTCTTCTTCATCTTCTTCC 
ACGAACTCATCATCATGTTTGGAGCAGCCTAGGGTATTCTCATGTAACTATTGTCAAAGA 
AAGTTTTACAGCTCTCAAGCTCTTGGTGGTC^^ 

TTAGCCAAGAAGAGTCGAGAACTCTTTAGATCCTCAAACACTGTTGATTCTGATCAGCCT
TACCCGTTCTCCGGTCGCTTTGAGCTTTACGGCCGTGGCTACCAAGGATTTCTCGAAAGT
GGCGGCTCGAGGGACTTCTCCGCCCGCCGTGTGCCGGAGAGTGGTCTTGATCAGGATCAG 
GAGAAGAGTCACCTTGACTTATCCTTAAGGCTCTAAAAGAATCTTATATTTTGTTAGTCT 
ATATATTATCATATCAATTGTTAATCTTAAAATTGATTGTTTTAC 

TATTATCTGAAAGTTTTCTTTGTAAGTTGTAACTATGGTCCTAAATTCAAATCCAAATTO 
GATTTTGGAAGATGGTACCTAATGCAGTAGTTAAATAAGTTAAAAAAATGAAGGATCTAT
AATTCTCT 

>G364 Amino Acid Sequence (domain in AA coordinates: 54-76) 

MDYQPNTSLRLSLPSYKNHQLNLELVLEPSSMSSSSSSSTNSSSCLEQPRVFSCNYCQRK 

FYSSQALGGHQNAHKLERTLAKKSRELFRSSNTVDSDQPYPFSGRFELYGRGYQGFLESG 

GSRDFSARRVPESGLDQDQEKSHLDLSLRL* 

>G365 (69..755)

CAATTCTTTTACTTTCATTCTCTTTATATATTCTCTCTACGCTATAATATATATTACACA 
GAATATACATGGAACCGTCCATCAAAGGAGATCAAGAAATGTTAAAAATCAAGAAACAAG 
GTCATCAAGATCTTGAGTTGGGGTTGACCCTTT^ 

AGCTCAATCTCATCGATTCTTTCAAAACCAGCTCATCATCGACTTCTCATCATCAGCACC 
AGCAAGAACAATTGGCAGATCCGAGAGTGTTCTCGTGTAATTATTGTCAAAGAAAGTTCT 
ATAGTTCACAAGCGCTAGGCGGTCACCAAAACGCTCATAAACGTGAGCGCACCTTAGCCA 
AACGTGGACAGTATTACAAGATGACTCTCTCCTCCTTGCCTTCTTCAGCGTTTGCGTTTG 
GCCACGGTTCAGTCAGCAGATTCGCAAGCATGGCATCGTT^ 

ATAAC^GGTCAACGTTAGGGATTCAAGCTCATTC^CGATCCATAAGCCCAGCTTCTTAG 
GAAGACAAACGACGAGTTTAAGTCATGTTTTCAAACAGAGCATTCACCAGAAACCGACCA 
TAGGA^GATGTTGCCGGAGAAATTTCACCTTGAAGTCGCCGGAAATAATAACAGTAACA 
TGGTTGCTGCTAAGTTGGAGAGAATTGGACATTTCAAGAGCAACCAAGAAGATCATAATC 
AGTTTAAGAAAATTGACTTGACTCTTAAGCTATGAGCTCTGCCATCTTCTTTTTAGTCTT 
CATTATAACTTTTTTTATTCTCATCTTTGTTTGATATAATGATTGACGGCAGGGTGTGTT 
AGAGTTTCACTAATGATCAAGTTGTACTTTTTATATATTTCATTGATACCTTGTTGATGT 
AATTCAATATTTTAGGTCTGTTTTT 

>G365 Amino Acid Sequence (domain in aa coordinates: 70-90) 

MEPSIKGDQEMLKIKKQGHQDLELGLTLLSRGTATSSELNLIDSFKTSSSSTSHHQHQQE 

QLADPRVFSCNYCQRKFYSSQALGGHQNAHKRERTLAKRGQYYKMTLSSLPSSAFAFGHG

SVSRFASMASLPLHGSVNNRSTLGIQAHSTIHKPSFLGRQTTSLSHVFKQSIHQKPTIGK 

MLPEKFHLEVAGNNNSNMVAAKLERIGHFKSNQEDHNQFKKIDLTLKL*

>G367 (1..708)

ATGGACGCTTCAATAGTXTCCTCATCCACTGCTTTTCCATATCAAGATTCTCTAAACCAG 

AGC^TCGAAGACGAAGAAAGAGACGTTCATAATTCTAGTCACGAACTCAATCTCATCGAC 

TGCATAGACGACACAACGAGTATCGTTAACGAATCTACAACATCCACAGAACAAAAGCTT 

TTCTCATGCAACTATTGTCAAAGAACTTTCTATAGCTCAC^G(^CTTGGTGGT(^CCAA 

AACGCACACAAGAGAGAGAGAACGTTGGCGAAGAGAGGACAACGTATGGCAGCGTCAGCC 

TCAGCTTTTGGACATCCTTACGGTTTCTCTCCACTTCCTTTCCACGGACAGTACAACAAC 

CATAGGTCTTTAGGGATCCAAGCGCATTCGATAAGCCACAAGCTAAGTTCTTATAACGGG 

TTTGGTGGTCACTATGGTCAGATCAACTGGTCAAGACTTCCATTTGATCAACAACCAGCC 

ATAGGTAAATTTCCCTCAATGGATAATTTTCATCATCATCATCATCAGATGATGATGATG 

GCTCCTTCAGTAAATTCACGGTCCAATAACATCGATAGCCCAAGCAACACAGGACGGGTT 

CTAGAAGGGTCACCGACTCTTGAACAATGGCACGGAGACAAAGGATTGTTGTTAAGTACA 

AGTCATCATGAAGAGCAGCAGAAACTTGACTTGTCCCTCAAGCTTTGA 

>G367 Amino Acid Sequence (domain in AA coordinates: 63-84)

MDASIVSSSTAFPYQDSLNQSIEDEERDVHNSSHELNLIDCIDDTTSIVNESTTSTEQKL

FSCNYCQRTFYSSQALGGHQNAHKRERTLAKRGQRMAASASAFGHPYGFSPLPFHGQYNN

HRSLGIQAHSISHKLSSYNGFGGHYGQINWSRLPFDQQPAIGKFPSMDNFHHHHHQMMMM

APSVNSRSNNIDSPSNTGRVLEGSPTLEQWHGDKGLLLSTSHHEEQQKLDLSLKL* 

>G373 (1..1854)

ATGGCGATTGAAACTCAGCTTCCTTGCGACGGTGACGGTGTGTGTATGCGGTGTCAGGTG 
AATCCTCCGTCAGAAGAGACTCTCACTTGTGGCACGTGCGTCACTCCATGGCACGTGCCG 
TGTCTCCTCCCCGAATCACTCGCTTCTTCCACTGGAGAGTGGGAGTGTCCCGATTGCTCC 
GGCGTTGTCGTTCCCTCCGCCGCTCCGGGTACCGGAAACGCTCGACCTGAATCTTCCGGT 
TCAGTTCTCGTTGCTGCGATCCGTGCGATTCAGGCTGATGAGACTTTAACCGAAGCTGAG
AAAGCCAAAAAAAGGCAGAAACTGATGAGTGGGGGTGGTGACGATGGTGTCGATGAAGAA
GAGAAGAAGAAGTTAGAAATCTTTTGTTCTATTTGCATTCAATTGCCAGAAAGACCTATC 
ACGACACCGTGTGGGCAGAATTTCTGTTTGAAATGTOT 

GGGAAGCTAACTTGTATGATATGCCGAAGCAAAATTCCGAGACATGTGGCAAAAAATCCT 
CGCATCAACTTAGCTCTAGTTTCTGCTATTCGTTTAGCAAATGTTACCAAATGTTCTGTT 
GAGGCAACTGCAGCCAAGGTTCATCATATTATCCGCAACCAAGACCGTCCTGAGAAAGCA 
TTTACTACCGAGCGGGCAGTAAAAACTGGGAAAGCTAATGCTGCTAGCGGTAAGTTTTTT 
GTGACAATACCTCGTGATCATTTTGGTCCCATACCAGCTGAGAATGATGTCACTAGAAAG 
CAAGGTGTTTTGGTTGGAGAATCTTGGGAGGACAGGCT^GAG 

CATTTCCCGCATATTGCTGGCATTGCCGGGCAATCAGCGGTTGGAGCTCAGTCTGTGGCC 

CTCTCTGGAGGTTATGACGATGATGAGGATCATGGTGAATGGTTTCTCTACACAGGAAGT 

GGTGGAAGGGATCTCAGTGGAAACAAAAGAATTAACAAGAAACAGTCGTCTGACCAGGCG 

TTTAAAAACATGAATGAATCTCTAAGACTTAGTTGCAAAATGGGCTATCCTGTCCGAGTT 

GTCAGGTCTTGGAAGGAGAAGCGTTCTGCATATGCCCCTGCTGAAGGTGTGAGATATGAT 

GGGGTCTATCGAATTGAGAAGTGCTGGAGTAATGTTGGAGTACAGGGTTCTTTTAAGGTC 

TGTCGTTACCTGTTTGTTAGATGTGACAATGAGCCAGCTCCATGGACCAGTGATGAGCAT 

GGCGATCGTCCAAGACCGTTGCCTAATGTTCCGGAGCTTGAGACTGCTGCTGACCTGTTT 

GTGAGAAAGGAGAGTCCATCATGGGATTTCGATGAAGCTGAGGGTCGTTGGAAATGGATG 

AAGTCTCCTCCTGTTAGCAGAATGGCTTTGGATCCTGAGGAGAGGAAGAAGAATAAGAGA 

GCAAAAAATACTATGAAGGCCAGACTTCTGAAAGAATTTAGTTGCCAAATCTGTCGGG^ 

GTGCTGAGTCTTCCAGTGACGACGCCTTGTGCACACAACTTCTGCAAAGCATGCTTAGAA 

GCGAAGTTTGCTGGGATAACTCAACTGAGAGAGAGAAGCAATGGCGGACGTAAACTACGT 

GCAAAGAAGAACATCATGACCTGCCCTTGCTGCACGACGGATCTCTCCGAGTTTCTCCAA 

AACCCGCAGGTGAACAGAGAGATGATGGAGATAATAGAGAATTTTAAGAAGAGTGAGGAA 

GAGGCTGATGCATCCATTTCTGAAGAAGAAGAAGAAGAATCCGAACCTCCAACTAAGAAG 

ATTAAGATGGATAACAACTCTGTTGGTGGTAGTGGTACAAGTCTCTCAGCTTAA 

>G373 Amino Acid Sequence (domain in AA coordinates: 129-168) 

MAIETQLPCDGDGVCMRCQVNPPSEETLTCGTCVTPWHVPCLLPESLASSTGEWECPDCS

GVVVPSAAPGTGNARPESSGSVLVAAIRAIQADETLTEAEKAKKRQKLMSGGGDDGVDEE

EKKKLEIFCSICIQLPERPITTPCGHNFCLKCFEKWAVGQGKLTCMICRSKIPRHVAKNP

RINLALVSAIRLANVTKCSVEATAAKVHHIIRNQDRPEKAFTTERAVKTGKANAASGKFF

VTIPRDHFGPIPAENDVTRKQGVLVGESWEDRQECRQWGAHFPHIAGIAGQSAVGAQSVA

LSGGYDDDEDHGEWFLYTGSGGRDLSGNKRINKKQSSDQAFKNMNESLRLSCKMGYPVRV

VRSWKEKRSAYAPAEGVRYDGVYRIEKCWSNVGVQGSFKVCRYLFVRCDNEPAPWTSDEH

GDRPRPLPNVPELETAADLFVRKESPSWDFDEAEGRWKWMKSPPVSRMALDPEERKKNKR

AKNTMKARLLKEFSCQICREVLSLPVTTPCAHNFCKACLEAKFAGITQLRERSNGGRKLR

AKKNIMTCPCCTTDLSEFLQNPQVNREMMEIIENFKKSEEEADASISEEEEEESEPPTKK

IKMDNNSVGGSGTSLSA* 

>G396 (1..957) 

ATGGGGGAAAGAGATGATGGGTTGGGTTTGAGTCTAAGCTTGGGAAATAGTCAACAAAAA 

GAACCATCTCTGAGGTTGAATCTTATGCCGTTGACAACTTCTTCTTCTTCTTCTTCGTTT 

CAACACATGCACAATCAGAATAACAATAGCCATCCCCAGAAGATTCATAACATCTCTTGG 

ACTCATCTGTTTCAATCTTCTGGGATTAAACGTACAACTGCAGAGAGAAACTCCGACGCC 

GGGTCATTTCTAAGAGGTTTCAACGTGAACAGAGCTCAGTCTTCGGTGGCGGTAGTGGAC 

TTGGAAGAAGAAGCCGCCGTCGTCTCGTCTCCAAACAGCGCCGTTTCGAGTCTGAGTGGA 

AATAAAAGGGATCTTGCGGTGGCGAGAGGAGGAGATGAAAACGAGGCGGAGAGAGCTTCT 

TGCTCACGCGGAGGGGGAAGCGGTGGTAGCGACGATGAAGACGGCGGAAACGGCGACGGA 

TCAAGGAAGAAACTACGGTTATCGAAGGATCAAGCTCTTGTTCTCGAGGAGACTTTTAAA 

GAACATAGCACTCTTAATCCGAAGCAAAAGCTGGCTCTAGCAAAACAGTTGAATCTAAGG 

GCAAGACAAGTTGAAGTGTGGTTTCAGAACCGTAGGGCAAGGACGAAGCTGAAACAAACG 

GAGGTTGATTGTGAGTATTTAAAGAGATGTTGCGATAATCTGACCGAGGAGAATCGACGG 

CTGCAGAAAGAAGTGTCGGAGCTGAGGGCGTTGAAGTTGTCTCCACATCTCTACATGCAC 

ATGACTCCTCCTACTACTCTCACCATGTGCCCTTCTTGCGAACGTGTCTCCTCCTCTGCC 

GCCACTGTGACCGCTGCTCCTTCCACTACTACTACTCCTACGGTGGTGGGGCGGCCAAGT 

CCACAGCGATTAACTCCTTGGACTGCTATTTCTCTCCAGCAAAAATCAGGTCGCTAG 

>G396 Amino Acid Sequence (domain in AA coordinates: 159-220) 

MGERDDGLGLSLSLGNSQQKEPSLRLNLMPLTTSSSSSSFQHMHNQNNNSHPQKIHNISW
THLFQSSGIKRTTAERNSDAGSFLRGFNVNRAQSSVAVVDLEEEAAVVSSPNSAVSSLSG
NKRDLAVARGGDENEAERASCSRGGGSGGSDDEDGGNGDGSRKKLRLSKDQALVLEETFK
EHSTLNPKQKLALAKQLNLRARQVEVWFQNRRARTKLKQTEVDCEYLKRCCDNLTEENRR

LQKEVSELRALKLSPHLYMHMTPPTTLTMCPSCERVSSSAATVTAAPSTTTTPTVVGRPS

PQRLTPWTAISLQQKSGR* 

>G431 (1..1149) 

ATGGAGAGTGGTTCCAACAGCACTTCITGTCCAATGGCTTTTGCCGGGGATAATAGTC 
GGTCCGATGTGTCCTATGATGATGATGATGCCGCCCATCATGACATCACATCAACATCAT 
GGTCATGATCATCAACATCAACAACAAGAACATGATGGTTATGCATATCAGTCACACCAC 
CAACAAAGTAGTTCCCTTTTTCTTCAATCACTAGCTCCT 

GTTGCTTCTTCTTCTTCTCCTTCCTCTTGTGCTCCTGCCTATTCTCTAATGGAGATCCAT 

CATAACGAAATCGTTGCAGGAGGAATCAACCCTTGCTCCTCTTTCTCTTCTTCAGCCTCT 

GTCAAGGCCAAGATCATGGCTCATCCTCACTACCACCGCCTCTTGGCCGCTTATGTCAAT 

TGTCAGAAGGTTGGAGCACCACCGGAGGTTGTGGCGAGGCTGGAGGAGGCATGCTCGTCT 

GCCGCAGCCGCAGCCG(^TCTATGGGGC<^UVC^GGGTGTCTTGGTGAAGATCCAGGGCTT 

GATCAATTCATGGAAGCTTACTGTGAAATGCTCGTTAAGTATGAGCAAGAGCTCTCCAAA 

CCTTTCAfitGGAAGCTATGGTCTTCCTTCAACGTGTCGAGTGTCAATTCAAATCCCTCTCT 

CTATCCTCACCTTCCTCTTTCTCCGGTTATGGAGAGACAGCAATTGATAGGAACAATAAT 

GGGTCATCCGAGGAAGAAGTCGATATGAACAATGAATTTGTAGATCCIACAAGCTGAGGAT 

AGAGAGCTTAAAGGACAGCTCTTGCGC^GTACAGTGGTTACTTAGGGAGCCTCAAGCAA 

GAGTTCATGAAGAAGAGGAAGAAAGGAAAGCTCCCTAAAGAAGCTCGTC^CAACTGCTT 

GATTGGTGGAGCCGTCACTACAAATGGCCTTACCCTTCGGAGCAACAAAAGCTCGCCCTT 

GCGGAATCAACGGGGCTGGACCAGAAACAGATAAACAATTGGTTCATAAACCAGAGGAAA 

CGGCATTGGAAGCCGTCGGAGGACATGC^GTTTGTAGTAATGGACGCAACACATCCTCAC 

CATTACTTCATGGATAATGTCTTGGACAATCCTCT 

ATGCTTTGA 

>G431 Amino Acid Sequence (domain in AA coordinates: 286-335) 

MESGSNSTSCPMAFAGDNSDGPMCPMMMMMPPIMTSHQHHGHDHQHQQQEHDGYAYQSHH 

QQSSSLFLQSLAPPQGTKNKVASSSSPSSCAPAYSLMEIHHNEIVAGGINPCSSFSSSAS 

VKAKIMAHPHYHRLLAAYVNCQKVGAPPEVVARLEEACSSAAAAAASMGPTGCLGEDPGL

DQFMEAYCEMLVKYEQELSKPFKEAMVFLQRVECQFKSLSLSSPSSFSGYGETAIDRNNN

GSSEEEVDMNNEFVDPQAEDRELKGQLLRKYSGYL^^ 

DWWSRHYKWPYPSEQQKLALAESTGI^QKQINNWFINQRKRH 

HYFMDNVLDNPFPMDHI S STML * 

>G479 (1..1128) 

ATGGAGATGGGTTCCAACTCGGGTCCGGGTCATGGTCCGGGTCAGGCAGAGTCGGGTGGT 
TCCTCCACTGAGTCATCCTCTTTCAGTGGAGGGCTCATGTTTGGCCAGAAGATCTACTTC 
GAGGACGGTGGTGGTGGATCCGGGTCTTCTTCCTCAGGTGGTCGTTCAAACAGACGTGTC 
CGTGGAGGCGGGTCGGGTCAGTCGGGTCAGATACCAAGGTGCCAAGTGGAAGGTTGTGGG 
ATGGATCTAACCAATGCAAAAGGTTATTACTCGAGACACCGAGTTTGTGGAGTGCACTCT 
AAAACACCTAAAGTCACTGTGGCTGGTATCGAACAGAGGTTTTGTCAACAGTGCAGCAGG 
TTTCATCAGCTTCCGGAATTTGACCTAGAGAAAAGGAGTTGCCGCAGGAGACTCGCTGGT 
CATAATGAGCGACGAAGGAAGCCACAGCCTGCGTCTCTCTCTGTGTTAGCTTCTCGTTAC 
GGGAGGATCGCACCTTCGCXTTACGAAAATGGTGATGCTGGAATGAATGGAAGCTTTCTT 
GGGAACCAAGAGATAGGATGGCCAAGTTCAAGAACATTGGATACAAGAGTGATGAGGCGG 
CCAGTGTCGTCACCGTCATGGCAGATCAATCCAATGAATGTATTTAGTCAAGGTTCAGTT 
GGTGGAGGAGGGACAAGCTTCTCATCTCCAGAGATTATGGACACTAAACTAGAGAGCTAC 
AAGGGAATTGGCGACTCAAACTGTGCTCTCTCTCTTCTGTCAAATCCACATCAACCACAT 
GACAACAACAACAACAACAACAAGAACAGCAACAACAACAACAATACATGGCGAGCTTCT 
TCAGGTTTTGGCCCGATGACGGTTACAATGGCTCAACCACCACCTGCACCTAGCCAGCAT 
CAGTATCTGAACCCGCCTTGGGTATTCAAGGACAATGATAATGATATGTCTCCTGTTTTG 
AATTTAGGTCGATACACCGAGCCAGATAATTGTC^GATAAGTAGTGGCACGGCAATGGGT 
GAGTTCGAGTTATCTGATCACCATCATCAAAGTAGGAGACAGTACATGGAAGATGAGAAC 
ACAAGGGCTTATGACTCTTCTTCTCACCATACCAACTGGTCTCTCTGA 

>G479 Amino Acid Sequence (conserved domain in AA coordinates: 70-149)

MEMGSNSGPGHGPGQAESGGSSTESSSFSGGLMFGQKIYFEDGGGGSGSSSSGGRSNRRV 

RGGGSGQSGQIPRCQVEGCGMDLTNAKGYYSRHRVCGVHSKTPKVTVAGIEQRFCQQCSR
FHQLPEFDLEKRSCRRRLAGHNERRRKPQPASLSVLASRYGRIAPSLYENGDAGMNGSFL

GNQEIGWPSSRTLDTRVMRRPVSSPSWQINPMNVFSQGSVGGGGTSFSSPEIMDTKLESY

KGIGDSNCALSLLSNPHQPHDNNNNN^^ 

QYLNPPWVFKDNDNDMSPVLNLGRYTEPDNCQISSGTAMGEFELSDHHHQSRRQYMEDEN

TRAYDSSSHHTNWSL* 

>G546 (1..588)

atgactcgaccgtcaagattacttgagacggcggcgccaccaccacaaccgtcggaggag 
atgatcgcagcggaatccgacatggtggtgatcttgtcggctcttctttgcgctcttatc 
tgcgttgctggtctcgccgccgtcgtacgatgcgcttggctccggcggtttacagccgga 
ggagattcgccgtcaccgaacaaaggcttgaaaaagaaagctcttcagtctcttccaaga 
tccactttcaccgccgcggaatcaacctccggcgccgccgctgaagagggagactcgacg 
gaatgtgctatttgcctcactgacttcgccgacggtgaagaaataagagtgcttcctctt 
tgtggtcattctttccacgtggagtgtattgacaaatggctagtttctaggtcttcttgt 
ccttcttgtcgcaggattcttacgccggtgagatgtgaccggtgtggtcatgcttctacg 
gcggagatgaaagatcaagctcatcgtcatcaacatcaccaacactcttctactaccatt 
cctacgtttcttccttaa 

>G546 Amino Acid Sequence (domain in AA coordinates: 114-155)
MTRPSRLLETAAPPPQPSEEMIAAESDMVVILSALLCALICVAGLAAVVRCAWLRRFTAG
GDSPSPNKGLKKKALQSLPRSTFTAAESTSGAAAEEGDSTECAICLTDFADGEEIRVLPL 
CGHSFHVECIDKWLVSRSSCPSCRRILTPVRCTRCGHASTAEMKDQAHRHQHHQHSSTTI 
PTFLP* 

>G551 (1..708) 

ATGGAGTGGTCAACAACGAGCAACGTAGAAAACGTGAGAGTAGCTTTCATGCCACCGCCA 
TGGCCGGAGTCTAGTTCCTTTAACTCGCTCCACAGCTTCAACTTTGATCCTTACGCAGGA 
AATTCATATACGCCTGGCGATACACAAACCGGACCGGTTATCTCTGTACCGGAATCAGAA 
AAGATCATGAATGCGTACCGATTTCCGAACAACAACAATGAGATGATAAAAAAGAAGAGA 
CTAACGAGTGGACAATTAGCTTCACTTGAGCGAAGTTTTCAAGAAGAGATCAAATTAGAT 
TCAGACAGGAAGGTGAAGCTGTCGAGAGAGCTCGGTCTGCAGCCACGTCAGATAGCAGTT 
TGGTTCCAAAACCGCCGTGCACGGTGGAAGGCGAAGCAGCTTGAGCAGTTGTACGACTCG 
CTTAGACAAGAGTACGACGTCGTTTCTAGGGAGAAACAAATGTTACACGATGAGGTGAAG 
AAGCTGAGAGCTTTACTAAGAGACCAGGGTTTGATCAAGAAGCAAATCTCTGCCGGGACC 
ATCAAAGTTTCCGGTGAGGAAGACACGGTGGAGATTTCATCGGTGGTGGTAGCTCATCCA 
AGAACGGAGAATATGAACGCAAATCAAATCACCGGAGGGAATCAAGTTTACGGTCAATAC 
AACAATCCGATGCTGGTTGCTTCCTCTGGCTGGCCGTCATACCCCTGA 

>G551 Amino Acid Sequence (conserved domain in AA coordinates: 73-133)

MEWSTTSNVENVRVAFMPPPWPESSSFNSLHSFNFDPYAGNSYTPGDTQTGPVISVPESE

KIMNAYRFPNNNNEMIKKKRLTSGQLASLERSFQEEIKLDSDRKVKLSRELGLQPRQIAV

WFQNRRARWKAKQLEQLYDSLRQEYDVVSREKQMLHDEVKKLRALLRDQGLIKKQISAGT
IKVSGEEDTVEISSVVVAHPRTENMNANQITGGNQVYGQYNNPMLVASSGWPSYP*
>G578 (1..978) 

ATGCATAGTTTGAATGAAACAGT7VATTCCTGATGTTGATTACATGCAGTCTGATAGAGGG 
CATATGCATGCTGCTGCCTCTGATTCCAGTGATCGATCAAAGGATAAGTTGGATCAAAAG 
ACCCTTCGTAGGCTTGCTCAAAATCGTGAGGCAGCAAGAAAAAGCAGATTGAGGAAGAAG 
GCGTATGTTCAGCAGCTGGAAGATAGTCGATTAAAGCTGACTCAAGTTGAGCAGGAGCTG 
CAAAGAGCAAGACAGCAGGGAGTTTTCATCTCAAGTTCAGGAGACCAAGCTCATTCTACT 
GGTGGCAATGGTGGGGCTTTGGCATTTGATGCAGAACACTCACGATGGCTTGAAGAAAAG 
AACAGGCAAATGAACGAGCTGAGATCTGCCCTGAATGCTCATGCAGGTGATACTGAGCTC 
CGGATAATTGTGGATGGAGTGATGGCTCACTATGAGGAGCTTTTCAGGATTAAGAGCAAT 
GCATCTAAGAATGATGTCTTCCACTTGTTATCTGGAATGTGGAAAACACCAGCTGAGCGA 
TGTTTCTTGTGGCTTGGCGGGTTCCCGTCATCCGAACTTCTCAAGCTTCTTGCGAATCAG 
CTAGAGCCCATGACAGAACGACAGGTAATGGGCATCAATAGCTTGCAGCAGACGTCGCAG 
CAGGCAGAAGATGCTTTATCTCAAGGGATGGAGAGTTTACAGCAATCCCTAGCTGATACT 
TTATCCAGTGGAACTCTTGGTTCCAGTTCATCGGATAATGTCGCGAGCTACATGGGTCAG 
ATGGCCATGGCAATGGGC^UVGTTAGGCACCCTCGAAGGATTCATACGCCAGGCTGATAAC 
TTGAGGCTGCAAACACTACAACAGATGCTTCGAGTATTAACAACACGTCAGTCAGCTCGT 
GCTCTTCTTGCTATACACGATTATTCATCTCGATTACGTGCTCTTAGTTCCTTGTGGCTT 
GCCCGGCCAAGAGAGTGA

>G578 Amino Acid Sequence (domain in AA coordinates: 36-96)
MHSLNETVIPDVDYMQSDRGHMHAAASDSSDRSKDKLDQKTLRRLAQNREAARKSRLRKK

AYVQQLEDSRLKLTQVEQELQRARQQGVFISSSGDQAHSTGGNGGALAFDAEHSRWLEEK
NRQMNELRSALNAHAGDTELRIIVDGVMAHYEELFRIKSNASKNDVFHLLSGMWKTPAER

CFLWLGGFPSSELLKLLANQLEPMTERQVMGINSLQQTSQQAEDALSQGMESLQQSLADT
LSSGTLGSSSSDNVASYMGQMAMAMGKLGTLEGFIRQADNLRLQTLQQMLRVLTTRQSAR
ALLAIHDYSSRLRALSSLWLARPRE* 
>G596 (168..1121)

TAATTTCTCTACTTCAGATTTTTTTCTCCTTAGATC 

CCTCAAGCTAAGATTCTGGTTTTGTGAGTTGAGTGGATGAGAAGAGGAGAGATTAACTAA 
ATTAGGGTTTCAATTGTTTACTTTTTGTTTGCTTTTT 

CTCGCTCTCTTCCTCC^CCTTTTCTCTCAAGAGATCTCCATCTTCACCCACACCATCAAT 
TCCAGCATCAGCAGCAGCAGCAGCAACAGAATCACGGCCACGATATAGACCAGCACCGAA 
TCGGTGGGCTAAAACGTGACCGAGATGCTGATATCGATCCCAACGAGCACTCTTCAGCCG 
GAAAAGATCAAAGTACTCCTGGCTCCGGTGGAGAAAGCGGCGGCGGAGGAGGAGGAGATA 
ATCACATCACGAGAAGGCCACGTGGCAGACCAGCGGGATCTAAGAACAAACCAAAACCGC 
CAATCATCATCACTCGAGACAGCGCAAACGCTCTCAAATCTCATGTCATGGAAGTAGCAA 
ACGGATGTGACGTCATGGAAAGTGTC^CCGTCTTCGCTCGCCGTCGCCAACGTGGC^TCT 
GCGT1TTGAGCGGAAACGGCGCCGTTACCAACGTTACCATAAGACAACCAGCTTCAGTAC 
CTGGTGGTGGCTCATCTGTCGTTAACTTACACGGACGTTTCGAGATTCTTTCTCTCTCGG 
GATCATTCC^ITCCTCCTCCGGCTCCACCAGCTGCGTCAGGTCTAACGATTTACTTAGCCG 
GTGGTCAGGGACAGGTTGTTGGAGGAAGCGTGGTTGGTCCACTCATGGCTTCAGGACCTG 
TAGTGATTATGGCAGCTTCGTTTGGAAACGCTGCGTATGAGAGACTGCCGTTGGAGGAAG 
ACGATCAAGAAGAGCAAACAGCTGGAGCGGTTGCTAATAATATCGATGGAAACGCAACAA 
TGGGTGGTGGAACGCAAACGCAAACTCAGACGGAGCAGCAA 

AAGATCCGACGTCGTTTATACAAGGGTTGCCTCCGAATCTTATGAATTCTGTTCAATTGC 
CAGCTGAAGCTTATTGGGGAACTCCGAGACCATCTTTCTAAATCGCGAAGAAAAAACAAG 
TTAGATACGTTCGTTGTTTTTAATTTATAATCTCTCTTCTGTCAAGTTTTAATTTTCTTT 
TTCTTCTTCTTTG1TTTCTAAAGATAATTGTAGTCTTTGACGAAGATTCGTGGTACGTAT 
GAATCGAAGAGAATCGTTTTGGTCATGGGATTGCTCGATCTATTAGGTTTGAGAGGGGGT 
TTGTGTTTTGCGTTGACTAGCAGATTATAAAATTGTTGATTTTCGAGTTTTTATTTTCAT 
GTGTTGGTGATAAA 

>G596 Amino Acid Sequence (domain in AA coordinates: 89-96)

MDQVSRSLPPPFLSRDLHLHPHHQFQHQQQQQQQNHGHDIDQHRIGGLKRDRDADIDPNE

HSSAGKDQSTPGSGGESGGGGGGDNHITRRPRGRPAGSKNKPKPPIIITRDSANALKSHV

MEVANGCDVMESVTVFARRRQRGICVLSGNGAVTNVTIRQPASVPGGGSSVVNLHGRFEI

LSLSGSFLPPPAPPAASGLTIYLAGGQGQVVGGSVVGPLMASGPVVIMAASFGNAAYERL

PLEEDDQEEQTAGAVANNIDGNATMGGGTQTQTQTQQQQQQQLMQDPTSFIQGLPPNLMN

SVQLPAEAYWGTPRPSF* 

>G617 (59..1141) 

CAGATCTGTTCTTTACACCAAATTGAGTACTGAAGATCTTGTTGAGTGAATTAAAGAGAT 
GAGATCAGGAGAATGTGATGAAGAGGAGATTCAAGCAAAGCAAGAAAGAGATCAAAATCA 
AAATCATCAAGTAAACTTAAACCACATGTTGCAACAACAACAGCCGAGTTCGGTATCATC 
TTCAAGGCAATGGACTTCAGCTTTTAGGAATCCAAGAATCGTTCGAGTCTCAAGAACATT 
CGGTGGCAAAGACAGACACAGCAAAGTATGTACAGTCCGTGGTCTTCGAGACCGGAGGAT 
AAGGTTGTCCGTACCTACAGCTATTCAACTCTACGACCTTCAAGATCGATTAGGGCTGAG 
TCAGCCAAGCAAAGTCATTGATTGGTTACTCGAAGCAGCAAAAGATGACGTAGACAAGCT 
ACCTCCTCTACAAXTCCCACATGGATTTAACCAGATGTATCCAAATCTCATCTTCGGAAA 
CTCCGGGTTTGGAGAATCTCCATCTTCAACTACATCAACAACGTTTCCAGGAACCAATCT 
CGGGTTCTTGGAAAATTGGGATCTTGGTGGTTCTTCAAGAACAAGAGCAAGATTAACCGA 
TACAACTACGACCCAAAGAGAAAGTTTTGATCTTGATAAAGGAAAATGGATCAAAAACGA 
CGAGAATAGTAATCAAGATCATCAAGGGTTTAACACCAATCATCAACAACAATTTCCTCT 
GACCAATCCGTACAACAACACTTCAGCTTATTACAACCTTGGACATCTTCAACAATCGTT 
AGACCAATCTGGTAATAACGTTACTGTCGCAATATCTAATGTTGCTGCTAATAATAACAA 
TAATCTCAATTTGCATCCTCCTTCCTCGTCTGCCGGAGATGGATCTCAGCTTTTTTTCGG 
TCCTACTCCTCCGGCAATGAGCTCTCTATTCCCGACATACCCTTCGTTTCTTGGAGCTTC 
TCATCATCATCATGTCGTCGATGGAGCCGGTCATCTTCAGCTCTTTAGCTCGAATTCAAA
TACCGCATCGCAGCAACACATGATGCCGGGTAATACGAGTTTGATTAGACCATTTCATCA
TTTGATGAGCTCGAATCATGATACGGATCATCATAGTAGCGATAATGAATCAGATTCTTG 
AATGATTTTATATATCTACACTATACATTGAAAATGTTATATGTATACGTATTC 
ATTTTGATATATATGCGTATTGTTGGATTGGTTTATGTATCT 

>G617 Amino Acid Sequence (domain in AA coordinates: 64-118) 

MRSGECDEEEIQAKQERDQNQNHQVNLNHMLQQQQPSSVSSSRQWTSAFRNPRIVRVSRT 

FGGKDRHSKVCTVRGLRDRRIRLSVPTAIQLYDLQDRLGLSQPSKVIDWLLEAAKDDVDK 

LPPLQFPHGFNQMYPNLIFGNSGFGESPSSTTSTTFPGTNLGFLENWDLGGSSRTRARLT 

DTTTTQRESFDLDKGKWIKNDENSNQDHQGFNTNHQQQFPLTNPYNNTSAYYNLGHLQQS

LDQSGNNVTVAISNVAANNNNNLNLHPPSSSAGDGSQLFFGPTPPAMSSLFPTYPSFLGA

SHHHHVVDGAGHLQLFSSNSNTASQQHMMPGNTSLIRPFHHLMSSNHDTDHHSSDNESDS 
* 

>G620 (40..666)

GAATTGAACTTGGACCAGCACAGCAACAACCCAACCC 

GCCGGCGCCGGTGACAAGAACAATGGTATCGTGGTCCAGCAGCAACCACCATGTGTGGCT 

CGTGAGCAAGACCAATACATGCCAATCGCAAACGTCATAAGAATCATGCGTAAAACCTTA 

CCGTCTCACGCCAAAATCTCTGACGACGCCAAAGAAACGATTCAAGAATGTGTCTCCGAG 

TACATCAGCTTCGTGACCGGTGAAGCCAACGAGCGTTGCCAACGTGAGCAACGTAAGACC 

ATAACTGCTGAAGATATCCTTTGGGCTATGAGCAAGCTTGGGTTCGATAACTACGTGGAC 

CCCCTCACCGTGTTCATTAACCGGTACCGTGAGATAGAGACCGATCGTGGTTCTGCACTT 

AGAGGTGAGCCACCGTCGTTGAGACAAACCTATGGAGGAAATGGTATTGGGTTTCACGGC 

CCATCTCATGGCCTACCTCCTCCGGGTCCTTATGGTTATGGTATGTTGGACCAATCCATG 

GTTATGGGAGGTGGTCGGTACTACCAAAACGGGTCGTCGGGTCAAGATGAATCCAGTGTT 

GGTGGTGGCTCTTCGTCTTCCATTAACGGAATGCCGGCTTTTGACCATTATGGTCAGTAT 

AAGTGAAGAAGGAGTTATTCTTCATTTTTATATCTATTCAAAACATGTGTTTCGATAGAT 

ATTTTATTTTTATGTCTTATCAATAACATTTCTATATAATGTTGCTTCTTTAAGGAAAAG 

TGTTGTATGTCAATACTTTATGAGAAACTGATTTATATATGCAAAT 

>G620 Amino Acid Sequence (domain in AA coordinates: 20-118) 

MTSSVIVAGAGDKNNGIVVQQQPPCVAREQDQYMPIANVIRIMRKTLPSHAKISDDAKET

IQECVSEYISFVTGEANERCQREQRKTITAEDILWAMSKLGFDNYVDPLTVFINRYREIE

TDRGSALRGEPPSLRQTYGGNGIGFHGPSHGLPPPGPYGYGMLDQSMVMGGGRYYQNGSS

GQDESSVGGGSSSSINGMPAFDHYGQYK* 

>G625 (151..1137)

AATCGACCATTCACAACGATGACATTCAAACACTCTTCAGTTTCCCTTCCTTCTTGATTC 
GTCCTCTCCACTATTTTTCTCAATTTCTTTAATCTCTCTCTTTCTCTCTCTACTTCCTCT 
TCCTCTTCTTCTTCTTCTTCTTCTTCATCTATGGACCCTTTAGCTTCCCAACATCAACAC 
AACCATCTGGAAGATAATAACCAAACCCTAACCCATAATAATCCTCAATCCGATTCCACC 
ACCGACTCATCAACTTCCTCCGCTCAACGCAAACGCAAAGGCAAAGGTGGTCCGGACAAC 
TCCAAGTTCCGTTACCGTGGCGTTCGACAAAGAAGCTGGGGCAAATGGGTCGCCGAGATC 
CGAGAGCCACGTAAGCGCACTCGCAAGTGGCTTGGTACTTTCGCAACCGCCGAAGACGCC 
GCACGTGCCTACGACCGGGCTGCCGTTTACCTATACGGGTCACGTGCTCAGCTCAACTTA 
ACCCCTTCGTCTCCTTCCTCCGTCTCTTCCTCTTCCTCCTCCGTCTCCGCCGCTTCTTCT 
CCTTCCACCTCCTCTTCCTCCACTCAAACCCTAAGACCTCTCCTCCCTCGCCCCGCCGCC 
GCCACCGTAGGAGGAGGAGCCAACTTTGGTCCGTACGGTATCCCTTTTAACAACAACATC 
TTCCTTAATGGTGGGACCTCTATGTTATGCCCTAGTTATGGTTTTTTCCCTCAACAACAA 
CAACAACAAAATCAGATGGTCCAGATGGGACAATTCCAACACCAACAGTATCAGAATCTT 
CATTCTAATACTAACAATAACAAGATTTCTGACATCGAGCTCACTGATGTTCCGGTAACT 
AATTCGACTTCGTTTCATCATGAGGTGGCGTTAGGGCAGGAACAAGGAGGAAGTGGGTGT 
AATAATAATAGTTCGATGGAGGATTTGAACTCTCTAGCTGGTTCGGTGGGTTCGAGTCTA 
TCAATAACTCATCCACCGCCGTTGGTTGATCCGGTATGTTCTATGGGTCTGGATCCGGGT 
TATATGGTTGGAGATGGATCTTCGACCATTTGGCCTTTTGGAGGAGAAGAAGAATATAGT 
C^TAATTGGGGGAGTATTTGGGATTTTATTGATCCCATCTTGGGGGAATTCTATTAATTT 
GTTTTTGTGGAAGATCATATTATATACGATGAGCATCCCTAAGGTCGGTCAAGAGCATTG 
GAGATTCATTGTTGAGAGGAATCAAAGAGATTGCATTCTATGAGGAGCTCTGCATGCAAA 
ATTTTGGAGGATTTTTTTACTACCTATAGAGATAAATAAGAGGGTATTTTTATTATTTTT 
TTGAAGATTTTTATTTTCAAGGAATTCGTAAAAGAGATTACGGTTCCAATAAAGTATGTA 
TATGTGGAAGAGAATCGGAGGAGATGGTGGAAAGTTGTATGGGAATTTTATTGGTTCAAC
ACTTCCTTCACAGTGTGCCTACCTTAATATATAATTATTGATAGGATATGATAATTTCTG

>G625 Amino Acid Sequence (conserved domain in AA coordinates: 52-119)

MDPLASQHQHNHLEDNNQTLTHNNPQSDSTTDSSTSSAQRKRKGKGGPDNSKFRYRGVRQ

RSWGKWVAEIREPRKRTRKWLGTFATAEDAARAYDRAAVYLYGSRAQLNLTPSSPSSVSS

SSSSVSAASSPSTSSSSTQTLRPLLPRPAAATVGGGANFGPYGIPFNNNIFLNGGTSMLC

PSYGFFPQQQQQQNQMVQMGQFQHQQYQNLHSNTNNNKISDIELTDVPVTNSTSFHHEVA

LGQEQGGSGCNNNSSMEDLNSLAGSVGSSLSITHPPPLVDPVCSMGLDPGYMVGDGSSTI

WPFGGEEEYSHNWGSIWDFIDPILGEFY* 

>G658 (17..757)

CCACGCGTCCGCTCACATGAACAAAGGAGCTTGGACTAAAGAAGAAGATCAGCTTCTTGT 

TGATTACATCCGTAAACACGGTGAAGGTTGCTGGCGATCTCTCCCTCGCGCCGCTGGATT 

ACAAAGATGTGGTAAGAGTTGTAGATTGAGATGGATGAATTATCTAAGACCAGATCTCAA 

AAGAGGCAATTTTACTGAAGAAGAAGATGAACTCATCATCAAGCTCCATAGCTTGCTCGG 

TAACAAATGGTCTTTAATAGCTGGGAGATTACCAGGAAGAACAGATAACGAGATCAAGAA 

CTATTGGAACACTCATATCAAGAGGAAGCTTCTCAGCCGTGGGATTGATCCAAACTCTCA 

CCGTCTGATCAACGAATCCGTCGTGTCTCCGTCGTCTCTTCAAAACGATGTCGTTGAGAC 

TATACATCTTGATTTCTCTGGACCGGTTAAACCGGAACCGGTGCGTGAAGAGATTGGTAT 

GGTTAATAATTGTGAGAGTAGTGGAACGACGTCGGAGAAGGATTATGGGAACGAGGAAGA 

TTGGGTGTTGAATTTGGAACTCTCTGTTGGACCGAGTTATCGGTACGAGTCGACTCGGAA 

AGTGAGTGTTGTTGACTCGGCTGAGTCGACTCX5ACGGTGGGGTTCCGAGTTGTTTGGAGC 

TC^TGAGAGTGATGCGGTGTGTTTGTGTTGTCGGATTGGGTTGTTTCGTAATGAGTCGTG 

TCGGAATTGTCGGGTTTCTGATGTTAGAACTCATTAGAGAGTCAATCGAGAATTCTTTAG 

GAATCTTTTTATATATTTAGATCGTCAATTGTGTTTTTTTTTTGTTCAC^ 

AACATCAAGTAAGAAACTAGCATAATTATTTGATGGCAAAGCCAAAAGATTGTGCTC 

>G658 Amino Acid Sequence (domain in AA coordinates: 2-105) 

MNKGAWTKEEDQLLVDYIRKHGEGCWRSLPRAAGLQRCGKSCRLRWMNYLRPDLKRGNFT

EEEDELIIKLHSLLGNKWSLIAGRLPGRTDNEIKNYWNTHIKRKLLSRGIDPNSHRLINE

SVVSPSSLQNDVVETIHLDFSGPVKPEPVREEIGMVNNCESSGTTSEKDYGNEEDWVLNL

ELSVGPSYRYESTRKVSVVDSAESTRRWGSELFGAHESDAVCLCCRIGLFRNESCRNCRV

SDVRTH* 

>G716 (271..2079)

AAAAAAAAAGGGGAGAGATTTAGTTTTATCCl^CAGNGCCTGAANTACGTTCTGCAATCA 
ANACGGACATAACCGNCCGTTGTGTCCTGTTTATAAAGTTTTGCTTTTTTTATTTTCTCC 
ANTGATGGGTCTTTTCTTTCTTCTCTCTCTNGTGTTTCTTTCATGGGGTTAAGACTAGTG 

TTCTTCTCCAGTTCTCATCTGGGTTCTTCAATGGCGAGTGTTGAAGGTGATGATGATTTC 
GGAAGTTCTTCGTCAAGGTCTTATCAAGATCAACTATACACAGAGCTATGGAAAGTTTGT 
GCAGGTCCATTAGTGGAAGTTCCTCGTGCTCAAGAGAGAGTTTTCTACTTCCCTCAGGGT 
CACATGGAACAACTTGTGGCGTCAACTAATCAAGGAATCAATTCAGAAGAAATACCTGTT 
TTTGATCTTCCTCCAAAGATACTTTGTCGAGTTCTTGATGTCACTTTAAAGGCGGAGCAT 
GAAACAGATGAGGTTTACGCTCAGATCACATTACAACCAGAGGAAGATCAAAGTGAACCA 
AC^GTCTTGATCCACCTATTGTTGGACCA^ 

ATTTTAACGGCTTCAGATACAAGCACTCATGGTGGATTCTCTGTTCTTCGTAAACACGCC 
ACTGAATGCTTGCCTTCTTTGGATATGACACAAGCTACTCCTACTCAAGAACTTGTGACT 
AGAGATCTTCATGGCTTTGAATGGAGGTTTAAGCATATATTCAGAGGACAACCACGGAGG 
CATTTGCTTACTACGGGTTGGAGTACATTTGTATCCTCGAAAAGACTTGTAGCTGGAGAT 
GCTTTTGTGTTCTTGAGGGGTGAGAATGGGGATTTACGGGTTGGAGTGAGACGATTAGCT 
CGGCATCAAAGCACftATGCCTACTTCGGTTATTTCAAGTCAGAGCATGCATTTGGGAGTT 
CTTGCTACAGCTTCTCATGCTGTGCGTACAACAACAATCTTTGTTGTCTTTTACAAGCCT
AGGATAAGCCAATTCATAGTTGGGGTGAACAAGTATATGGAAGCTATAAAGCATGGATTT 
TCTCTCGGTACCCGATTCAGAATGAGGTTTGAAGGAGAAGAGTCTCCTGAGAGAATATTT 
ACTGGTACGATTGTGGGAAGTGGAGATCTATCTTCACAATGGCCAGCTTCTAAATGGAGG 
TCATTGCAGGTACAATGGGATGAGCCAACAACAGTTCAGAGACCAGATAAAGTCTCACCA 
TGGGAGATAGAGCCTTTCTTGGCAACTTCCCCAATTTCAACTCCTGCTCAACAACCACAA 
TCGAAATGCAAGCGGTCAAGACCCATCGAGCCATCAGTTAAAACACCAGCCCCACCTAGT 
TTCTTGTACAGCCTCCCTCAGAGCCAAGATTCCATTAATGCATCCCTTAAACTGTTTCAA 
GATCCATCACTTGAGAGAATTTCAGGTGGATACTCCTCAAACAACAGCTTCAAACCCGAG 
ACTCCTCCTCCTCCAACGAATTGTAGCTATAGGTTGTTTGGATTTGATCTCACAAGCAAT
TCTCCTGCTCCAATCCCTCAAGACAAGCAACCGATG 

CAAGAACCCATCACTCCAACCTCAATGAGTGAGCAGAAGAAGCAACAAACATCAAGAAGT 
CGAACTAAAGTGCAAATGCAAGGC^TTGCGGTTGGTCGTGCGGTTGATTTAACACTGTTG 
AAATCTTACGATGAACTGATTGATGAGCTTGAGGAGATGTTTGAGATTCAAGGACAGCTT 
CTTGCCCGAGACAAATGGATCGTTGTCTTCACTC^^ 

GGTGATGATCCGTGGAATGAGTTTTGCAAGATGGCAAAGAAGATATTTATATAT^ 

GATGAGGTTAAGAAAATGACAACGAAACTGAAGATTTCTTCGTCGTTAGAGAATGAGGAA 

TATGGTAATGAATCATTCGAAAATCGTAGTAGGGGGTGAGAGTTTTAGCTGTTAATTAAG 

GTTAATTCGGCGACGTCGTTTTAGTGCGTAAGTGTCTAAAGACTTTTTTTTTAGTCTGTG 

TATATAAAGTCTTGTCCTCTTTTTC7VTGTCAATTT 

GTTITGGGACAGTGGTTGATGGGGCGGTTTTACA 

AAACCATTCAATTTTCAAA 

>G716 Amino Acid Sequence (domain in AA coordinates: 24-355) 
MASVEGDDDFGSSSSRSYQDQLYTELWKVCAGPLVEVPRAQERVFYFPQGHMEQLVASTN
QGINSEEIPVFDLPPKILCRVLDVTLKAEHETDEVYAQITLQPEEDQSEPTSLDPPIVGP
TKQEFHSFVKILTASDTSTHGGFSVLRKHATECLPSLDMTQATPTQELVTRDLHGFEWRF
KHIFRGQPRRHLLTTGWSTFVSSKRLVAGDAFVFLRGENGDLRVGVRRLARHQSTMPTSV

ISSQSMHLGVLATASHAVRTTTIFVVFYKPRISQFIVGVNKYMEAIKHGFSLGTRFRMRF

EGEESPERIFTGTIVGSGDLSSQWPASKWRSLQVQWDEPTTVQRPDKVSPWEIEPFLATS 

PISTPAQQPQSKCKRSRPIEPSVKTPAPPSFLYSLPQSQDSINASLKLFQDPSLERISGG 

YSSNNSFKPETPPPPTNCSYRLFGFDLTSNSPAPIPQDKQPMDTCGAAKCQEPITPTSMS 

EQKKQQTSRSRTKVQMQGIAVGRAVDLTLLKSYDELIDELEEMFEIQGQLLARDKWIVVF 

TDDEGDMMLAGDDPWNEFCKMAKKIFIYSSDEVKKMTTKLKISSSLENEEYGNESFENRS

RG* 

>G725 (46..1122)

CCTCTTTCAGAGAGAGAAAGAGAGTCAGAGAGAGAGAGAGAGAGAATGTTCCATGCTAAG 
AAACCTTCAAGTATGAATGGTTCATATGAGAACAGAGCTATGTGCGTTCAAGGCGATTCA 
GGCCTTGTCCTCACCACCGACCCTAAACCGCGTTTGCGTTGGACCGTCGAACTCCACGAG 
CGTTTTGTGGACGCCGTCGCTCAGCTCGGCGGCCCCGACAAAGCGACCCCAAAGACGATT 
ATGAGAGTTATGGGTGTGAAGGGTCTTACTCTTTACCACCTAAAGAGCCATCTTCAGAAA 
TTCAGGCTTGGAAAGCAGCCGCACAAGGAGTACGGAGATCACTCCACAAAGGAAGGTTCA 
AGAGCTTCTGCCATGGATATTCAGCGCAACGTAGCTTCTTCTTCTGGCATGATGAGTCGC 
AACATGAATGAGATGCAAATGGAAGTGCAGAGAAGGTTGCATGAACAGCTAGAGGTGCAA 
AGACATCTGCAACTGAGGATTGAAGCACAAGGAAAGTACATGCAATCTATCTTGGAGAGA 
GCTTGCCAAACCCTAGCCGGTGAGAACATGGCAGCCGCCACCGCAGCAGCCGCCGTCGGA 
GGAGGATACAAGGGTAATCTGGGAAGTTCGAGTCTTTCAGCAGCGGTGGGCCCACCTCCT 
CATCCTCTTAGTTTCCCGCCGTTTCAAGACCTAAACATCTATGGAAACACAACCGACCAA 
GTCCTCGACCATCACAACTTCCATCATCAAAACATAGAGAACCATTTCACGGGTAACAAT
GCTGGAGACACCAACATTTACTTGGGGAAGAAGCGACCTAATCCTAATTTTGGTAACGAT
GTAAGGAAAGGACTATTGATGTGGTCTGATCAAGATCACGATCTTTCCGCAAACCAATCG 
ATCGATGATGAGCATAGAATTCAGATACAGATGGCTACACATGTCTCCACGGATTTGGAT 
TCTTTGTCGGAGATCTACGAAAGGAAATCAGGTTTATCAGGTGATGAAGGGAATAATGGT 
GGGAAATTACTGGAAAGGCCATCGCCTAGGAGATCACCATTGAGTCCTATGATGAACCCT
AATGGTGGATTAATACAAGGAAGAAACTCGCCATTTGGGTGATACAATTTATTAATTTTT 
ATCTATGAGTGATGCATGGGAATGTAAGAACGAGATATATATGTTTTGTCATTGTGAGTT 
TGACGTAGGGTTTAGAGAAAA 

>G725 Amino Acid Sequence (domain in AA coordinates: 39-87) 
MFHAKKPSSMNGSYENRAMCVQGDSGLVLTTDPKPRLRWTVELHERFVDAVAQLGGPDKA

TPKTIMRVMGVKGLTLYHLKSHLQKFRLGKQPHKEYGDHSTKEGSRASAMDIQRNVASSS
GMMSRNMNEMQMEVQRRLHEQLEVQRHLQLRIEAQGKYMQSILERACQTLAGENMAAATA

AAAVGGGYKGNLGSSSLSAAVGPPPHPLSFPPFQDLNIYGNTTDQVLDHHNFHHQNIENH
FTGNNAADTNIYLGKKRPNPNFGNDVRKGLLMWSDQDHDLSANQSIDDEHRIQIQMATHV
STDLDSLSEIYERKSGLSGDEGNNGGKLLERPSPRRSPLSPMMNPNGGLIQGRNSPFG* 
>G727 (43..1977)

GGAAGAGGACCCGATTCGGGTACTGCTGCTGGTGGGTCAAACTCCGACCCGTTTCCTGCG 
AATCTTCGAGTTCTTGTCGTTGATGATGATCCAACTTGTCTCATGATCTTAGAGAGGATG

CTTATGACTTGTCTCTACAGAGAGCAGAGAGCGCATTGTCTCTGCTTCGGAAGAACAAAG 

AATGGTTTTGATATTGTCATTAGTGATGTTCATATGCCTGACATGGATGGT^ 

CTTGAACACGTTGGTTTAGAGATGGATTTACCTGTTATCAATCTGAATGTTTTGAAACCT 

TTGGTTATAGTGATGTCTGCGGATGATTCGAAGAGCGTTGTGTTGAAAGGAGTGACTCAC 

GGTGCAGTTGATTACCTCATCAAACCGGTACGTATTGAGGCTTTGAAGAATATATGGCAA 

CATGTGGTGCGGAAGAAGCGTAACGAGTGGAATGTTTCTGAACATTCTGGAGGAAGTATT 

GAAGATACTGGCGGTGACAGGGACAGGCAGCAGCAGCATAGGGAGGATGCTGATAACAAC

TCGTCTTCAGTTAATGAAGGGAACGGGAGGAGCTCGAGGAAGCGGAAGGAAGAGGAAGTA 

GATGATCAAGGGGATGATAAGGAAGACTCATCGAGTTTAA 

TCTGTTGAATTGCATCAGCAGTTTGTTGCTGCTC 

TTAAAAACTTGCTTGCTTATGCATTTGTGTGTGTCGATTGGTAACATTGTGGAATTCCAG 
AAGTATCGGATATATCTGAGACGGCTTGGAGGAGTATCGCAACACCAAGGAAATATGAAC 
CATTCGTTTATGACTGGTCAAGATCAGAGTTTTGGACCTCTTTCTTCGTTGAATGGATTT 
GATCTTCAATCTTTAGCTGTTACTGGTCAGCTCCCTCCTCAGAGCCTTGCACAGCTTCAA 
GCAGCTGGTCTTGGCCGGCCTACACTCGCTAAACCAGGGATGTCGGTTTCTCCCCTTGTA 
GATCAGAGAAGCATCTTCAACTTTGAAAACCCAAAAATAAGATTTGGAGACGGACATGGT
CAGACGATGAACAATGGAAATTTGCTTC^TGGTGTCCC^CGGGTAGTCACATGCGTCTG 
CGTCCTGGACAGAATGTTCAGAGCAGCGGAATGATGTTGCCAGTAGCAGACCAGCTACCT 
CGAGGAGGACC^TCGATGCTACCATCCCTCGGGC^CAGCCGATATTGTCAAGC^GCGTT 
TCAAGAAGAAGCGATCTCACTGGTGCGCTGGCGGTTAGAAACAGTATCCCCGAGACCAAC 
AGCAGAGTGTTACCAACTACTCACTCGGTCTTCAATAACTTCCCCGCGGATCTACCTCGC 
AG<^GCTCCCCGTTGGCAAGTGCCCC^GGGATTT^ 

GAAGAGGTCAACAGCTCGGATGCAAAAGGAGGTTCATCAGCTGCTACTGCTGGATTTGGT 
AACCCAAGCTACGACATATTTAACGATTTC 

AGCAATAAACTAAACGATTGGGATCTGCGGAATATGGGATTGGTCTTCAGTTCCAATCAG 
GACGCAGCAACTGCAACCGCAACCGCAGCATTTTCCACTTCGGAAGCATACTCTTCGTCT 
TCTACGC^GAGAAAAAGACGGGAAACGGACGC^CAGTTGTGGGTGAGCATGGGC^GAAC 
CTGCAGTCACCGAGCCGGAATCTGTATCATCTGAACCACGTTTTTATGGACGGTGGTTCA 
GTCAGAGTGAAGTCAGAAAGAGTGGCGGAGACAGTGACTTGTCCTCCAGCAAATACATTG 
TTTCACGAGC^GTATAATCAAGAAGATCTGATGAGCGCATTTCTCAAACAGGTTTGATTA 
TTACTCGAATACAGTGCACTCTAAAAC 

>G727 Amino Acid Sequence (domain in AA coordinates: 226-269) 

MWPGHGRGPDSGTAAGGSNSDPFPAIHjRVLVVDDDPTCLMILBRMLMTCLYREQRAHCL 

CFGRTKNGFDIVISDVHMPDI^GFKLLEHVGLEMDLPVINLNVLKPLVIVMSADDSKSW 

LKGVTHGAVDYLIKPVRIEALKNIWQHVVRKKRNEWNVSEHSGGSIEDTGGDRDRQQQHR

EDADNNSSSVNEGNGRSSRKRKEEEVDDQGDDKEDSSSLKKPRVWSVELHQQFVAAW^ 

LGVDSELKTCLLMHLCVSIGNIVEFQKYRIYLRRLGGVSQHQGNMNHSFMTGQDQSFGPL 

SSLNGFDLQSLAVTGQLPPQSLAQLQAAGLGRPTLAKPGMSVSPLVDQRSIFNFENPKIR

FGDGHGQTMl^GNLLHGVPTGSHMRLRPGQNVQSSGI^LPVADQLPRGGPSMLPSLGQQP 

ILSSSVSRRSDLTGALAVRNSIPETNSRVLPTTHSVFNNFPADLPRSSFPLASAPGISVP 

VSVSYQEEVNSSDAKGGSSAATAGFGNPSYDIFNDFPQHQQHNKNI^ 

VFSSNQDAATATATAAFSTSEAYSSSSTQRKRRETDATVVGEHGQNLQSPSRNLYHLNHV

FMDGGSVRVKSERVAETVTCPPANTLFHEQYNQEDLMSAFLKQV* 

>G740 (25..924)

CTTCTTCAACTTTTTTTTTTAACGATGGCTTCAGAGGATCAATCGGCGGCGAGATCTACC 
GGGAAGGTGAACTGGTTCAACGCTTCTAAAGGCTATGGTTTCATTACTCCTGACGATGGC 
AGCGTAGAGCTTTTeGTTCATCAATCTTCAATTGTCTCCGAAGGTTACCGGAGTTTAACC 
GTCGGCGATGCGGTTGAGTTCGCTATTACTCAGGGAAGCGACGGTAAGACTAAAGCCGTC 
AATGTTACTGCTCCTGGTGGTGGTTCTCTCAAGAAGGAGAATAACTCTCGTGGTAACGGT 
GCTAGGCGCGGCGGCGGTGGAAGCGGTTGCTACAATTGCGGTGAGTTAGGTCATATCTCT 
AAAGATTGTGGTATTGGTGGCGGCGGCGGAGGTGGTGAACGTAGATCTAGAGGAGGAGAA 
GGTTGTTACAATTGTGGTGATACTGGTCACTTCGCTAGGGATTGTACTTCAGCTGGAAAC 
GGTGACCAACGTGGAGCCACCAAAGGTGGAAACGATGGTTGCTACACTTGCGGTGATGTT 
GGTCACGTGGCTAGGGATTGTACTCAGAAATCAGTTGGAAACGGAGACCAACGTGGAGCG 
GTCAAAGGTGGAAACGATGGTTGCTACACTTGTGGTGATGTTGGTCACTTTGCTAGGGAT 
TGTACTCAGAAGGTTGCTGCCGGAAACGTCAGAAGCGGTGGTGGTGGTAGTGGAACTTGT 
TATTCATGCGGTGGAGTTGGTCACATTGC^GAGATTGTGCGACTAAGAGACAGCCTTCT

CGTGGGTGTTACC^GTGTGGTGGTTCTGGTCACTTGGCTCGTGATTGTGACCAGAGAGGA 

AGCGGTGGAGGAGGTAATGATAATGCGTGCTACAAGTGTGGTAAGGAAGGTCACTTTGCA 

AGGGAATGTTCTTCTGTAGCTTAATCGATTTCCTAATCAACAAAACAAAAA^ 

GAAATTGAATCGAGTTATATAGTTTGGTATATATTACTCTTCGTTTTCATTTATCTTTTT 

TTTTGTTGTTGATGGGAATGAAATTGCCTGGTCCTTTTGGTGTGTTTTTGAGC 

ATTATACAGAGTGATCCCTTTTTTGTTATAACTACT 

TGGATGCTCTCTCCTTTTCTTCTATCTGTra 

ATGTCATCCAAA 

>G740 Amino Acid Sequence (domain in AA coordinates: 24-42, 232-268) 

MASEDQSAARSTGKVNWFNASKGYGFITPDDGSVELFVHQSSIVSEGYRSLTVGDAVEFA

ITQGSDGKTKAVNVTAPGGGSLKKENNSRGNGARRGGGGSGCYNCGELGHISKDCGIGGG

GGGGERRSRGGEGCYNCGDTGHFARDCTSAGNGDQRGATKGGNDGCYTCGDVGHVARDCT

QKSVGNGDQRGAVKGGNDGCYTCGDVGHFARDCTQKVAAGNVRSGGGGSGTCYSCGGVGH

IARDCATKRQPSRGCYQCGGSGHLARDCDQRGSGGGGNDNACYKCGKEGHFARECSSVA* 

>G770 (119..1069)

CCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGCTGACAAGCTGACTCT 
AGCAGATCTGGTACCGTCGACX3GTTCTTGGATTTGGAGTAAACTAAAGATCATATAAAAT 
GGAACAAGGAGATCATCAGCAGCATAAGAAAGAAGAAGAAGCTTTGCCACCGGGTTTCAG
ATTTCATCCGACGGATGAGGAGCTAATCTCATATTACTTGGTTAATAAGATTGCCGATCA 
AAACTTCACCGGGAAAGCAATCGCTGACGTTGATCTTAACAAGTCCGAGCCATGGGAGCT 
TCCTGAGAAGGCGAAAATGGGAGGAAAAGAATGGTACTTTTTTAGCCTCCGGGACCGGAA 
GTACCCGACGGGAGTGAGGACGAATAGGGCGACGAATACAGGATATTGGAAAACCACAGG 
AAAAGACAAAGAGATATTCAATAGCACAACCTCGGAGTTGGTCGGGATGAAGAAGACTTT 
GGTCTTTTACAGAGGACGAGCTCCTCGTGGGGAGAAGACTTGTTGGGTCATGCATGAGTA 
TCGACTTCACTCCAAGTCCTCATATAGAACCTCCAAGCAAGACGAGTGGGTAGTGTGTAG 
AGTGTTCAAGAAAACAGAAGCAACCAAGAAATACATAAGCACCAGTAGCAGCAGCACAAG
TCATCACCACAACAACCACACAAGAGCCTCAATACTATCAACCAACAACAATAATCCTAA
TTACTCATCAGACCTCCTTCAACTCCCACCGCATCTACAACCACACCCGAGCCTCAATAT 
TAACCAATCCCTCATGGCAAACGCCGTTCACCTAGCTGAGCTCTCAAGAGTCTTCCGTGC 
CTCTACAAGCACCACCATGGACTCTTCTCATCAGCAGCTAATGAACTACACCCACATGCC 
TGTCTCAGGGCTCAACCTCAACCTTGGCGGTGCACTGGTCCAGCCGCCTCCTGTTGTGTC 
TCTTGAGGATGTTGCCGCGGTTAGTGCTTCGTACAATGGCGAAAACGGGTTTGGAAATGT 
GGAGATGAGCCAGTGCATGGACTTGGATGGATACTGGCCATCTTATTGATTGGTAATTGT 
CAGTTTAAGTTATGGTTTTTATATTGTTTCCATTTACTTGTTGGTAAAACGATTTTGGTT 
GTTCTTGCGAACGCTCTAGACAGGCCTCGTACCGGATCCTCTAGCTAGAGCTTTCGTTCG 
TATCATCGGTTTC 

>G770 Amino Acid Sequence (domain in AA coordinates: 19-162) 
MEQGDHQQHKKEEEALPPGFRFHPTDEELISYYLVNKIADQNFTGKAIADVDLNKSEPWE
LPEKAKMGGKEWYFFSLRDRKYPTGVRTNRATNTGYWKTTGKDKEIFNSTTSELVGMKKT
LVFYRGRAPRGEKTCWVMHEYRLHSKSSYRTSKQDEWVVCRVFKKTEATKKYISTSSSST
SHHHNNHTRASILSTNNNNPNYSSDLLQLPPHLQPHPSLNINQSLMANAVHLAELSRVFR

ASTSTTMDSSHQQLMNYTHMPVSGLNLNLGGALVQPPPVVSLEDVAAVSASYNGENGFGN

VEMSQCMDLDGYWPSY* 

>G858 (99..869)

CACAGAGCCCAGGTTGATTGATTTTGTTATTCAGAGATATGGGGAGAGGAAGGATTGAGA
TTAAGAAGATTGAGAATATCAACAGTCGTCAAGTCACTTTCTCTAAGAGACGAAACGGTT 
TGATCAAGAAGGCTAAAGAGCTTTCGATTCTCTGTGACGCCGAGGTTGCTCTTATCATCT 
TCTCCAGCACCGGCAAGATTTACGATTTCTCCAGCGTCTGTATGGAGCAAATTCTTTCTA 
GATATGGATACACTACTGCGTCCACTGAGCATAAACAACAAAGAGAACACCAACTTCTAA 
TTTGTGCTTCACATGGAAATGAAGCTGTGTTGCGAAATGATGATTCTATGAAGGGGGAAC 
TTGAAAGATTACAGCTTGCAATTGAGAGACTTAAGGGTAAGGAGCTTGAAGGTATGAGTT 
TCCCGGATCTTATTTCTCTTGAAAACCAGTTGAACGAGAGCTTGCATAGTGTCAAGGATC 
AAAAGACACAAATCCTGCTCAACCAGATTGAGAGATCCAGGATACAGGAGAAAAAAGCAT 
TGGAAGAAAACCAAATCTTGCGCAAACAGGTTGAGATGTTGGGGAGAGGTTCAGGACCAA 
AAGTGTTGAATGAAAGGCCTCAAGATTCTAGCCCAGAAGCCGATCCCGAGAGCTCTTCAT 
CAGAAGAGGATGAGAATGACAACGAGGAGCACCATTCCGACACTTCCTTGCAGTTGGGGT

TGTCGTCGACGGGGTATTGCACAAAGAGAAAGAAGCCGAAGATCGAACTGGTCTGCGATA 

ACTCTGGGAGTCAAGTGGCTTCTGATTGATGGAATCGATTATTTTTCTAATTCTGGTTGT 

TTAGGGGTCTCTATGTGTCTTCTTGTTTCTGGCTGTTCTTTTGCTTTATTT 

TAGAGTTTTCTTAATGTTTAGGTGGAACATTT 

TCAATAACATTAGATTTTCTTAGTTAAAGACTTAAAGTTGCCCACACACCACACCATATG 

TGATTATGATGAATTTACATTTTATAAAAAAAAAAAAAAAAAAAAAAAAA 

>G858 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRGRIEIKKIENINSRQVTFSKRRNGLIKKAKELSILCDAEVALIIFSSTGKIYDFSSV
CMEQILSRYGYTTASTEHKQQREHQLLICASHGNEAVLRNDDSMKGELERLQLAIERLKG

KELEGMSFPDLISLENQLNESLHSVKDQKTQILLNQIERSRIQEKKALEENQILRKQVEM

LGRGSGPKVLNERPQDSSPEADPESSSSEEDENDNEEHHSDTSLQLGLSSTGYCTKRKKP 

KIELVCDNSGSQVASD* 

>G865 (282..920)

ATCCCCACTTGTTGTTCATCACCAAGCCAAGCTCCATGTCCTAGTCACTCCACAGATTCC 

CTATCATCATCAATTCGTTTCAAACTTAGTTCCTTTCAAAGTCTTGTACATATATACACA 

CACACCTATTATTCTCTTGGTGTGTTTGTGTGTTACATATACGTGTGAGTAC^TACTTTG 

TTGTAAAAGTGGATCGGAGGTATGGAAAGGGACCGGTTCCACCGGAAACATCGGCGGCGG 

CGGATGATAATTCGTCTTGGAACGAGACTGATGTCACCGCCATGGTCTCCGCTCTCAGCC 

GTGTCATAGAGAATCCGACAGACCCGCCGGTCAAACAAGAGCTTGATAAATCGGATCAAC 

ATCAACCAGACCAAGATCAACCAAGAAGAAGACACTATAGAGGCGTAAGGCAGAGACCAT 

GGGGTAAATGGGCGGCAGAAATCCGCGATCCAAAGAAAGCAGCCCGTGTCTGGCTCGGGA 

CTTTCGAGACGGCAGAGGAAGCTGCTTTAGCCTATGACCGAGCTGCCCTCAAATTCAAAG 

GCACCAAGGCTAAACTGAACTTCCCTGAACGGGTCCAAGGCCCTACTACCACCACAACCA 

TTTCTCATGCACCAAGAGGAGTTAGTGAATCCATGAACTCACCTCCTCCTCGACCTGGTC 

CACCTT(^^CTACTACTACTTCGTGGCCAATGACTTATAACCAGGA(^TACTTCAATACG 

CTCAGTTGCTTACGAGTAACAATGAGGTTGATTTATCATACTACACGTCGACTCTCTTCA 

GTCAACCTTTTTCAACGCCTTCTTCATCTTCTTCTTCCTCCCAACAGACGCAGCAACAGC 

AGCTACAACAACAACAACAGCAGCGTGAAGAAGAAGAGAAGAATTATGGTTACAATTATT 

ATAACTACCC^GAGAATAATCTAATTATTATTGTTGGTCGAATCAGTTTTATAAATAGC 

TATCATAGTTTCATTTTTGGTTTCCGTAACCTTTGTTGCATGGAAAATATGAATGAACGA 

GGGACATGTGTAACAATTTGTTTGTGTTTCGTAAATGTTAGTTGTATTTGGATTTGCTGA 

AGTTTGATTTTCTGAGCATAAATCATTTGACGGTCAAAAAAAAAAA 

>G865 Amino Acid Sequence (domain in AA coordinates: 36-103) 

MVSALSRVIENPTDPPVKQELDKSDQHQPDQDQPRRRHYRGVRQRPWGKWAAEIRDPKKA

ARVWLGTFETAEEAALAYDRAALKFKGTKAKLNFPERVQGPTTTTTISHAPRGVSESMNS

PPPRPGPPSTTTTSWPMTYNQDILQYAQLLTSNNEVDLSYYTSTLFSQPFSTPSSSSSSS

QQTQQQQLQQQQQQREEEEKNYGYNYYNYPRE*

>G872 (59..646)

CCGGAAACAGAATCCAATTCAACCAAACCGAATCGAACCGAACCGGAGTTTTTATCCAAT 
GGTGAAGCAAGCGATGAAGGAAGAGGAGAAGAAGAGAAACACGGCGATGCAGTCAAAGTA 
CAAAGGAGTGAGGAAGAGGAAATGGGGAAAATGGGTATCGGAGATCAGACTTCCACACAG 
CAGAGAACGAATTTGGTTAGGCTCTTACGACACTCCCGAGAAGGCGGCGCGTGCTTTCGA 
CGCCGCTCAATTTTGTCTCCGCGGCGGCGATGCTAATTTCAATTTCCCTAATAATCCACC 
GTCGATCTCCGTAGAAAAGTCGTTGACGCCTCCGGAGATTCAGGAAGCTGCTGCTAGATT 
CGCTAACACATTCCAAGACATTGTCAAGGGAGAAGAAGAATCGGGTTTAGTACCCGGATC 
CGAGATCCGACCAGAGTCTCCTTCTACATCTGCATCTGTTGCTACATCGACGGTGGATTA 
TGATTTTTCGTTTTTGGATTTGCTTCCGATGAATTTCGGGTTTGATTCCTTCTCCGACGA 
CTTCTCTGGCTTCTCCGGTGGTGATCGATTTACAGAGATTTTACCCATCGAAGATTACGG 
AGGAGAGAGTTTATTAGATGAATCTTTGATTCTTTGGGATTTTTGAATTCCCAAACATAA 

TTGTAACACGGAGCTCCAATGACCCGGGAATTTCTTTCGTTTCGGATCCGAATTTGATGT 

GGATCATATTCACACCTATATTTTTTCATTTTTTTGTTGTAAAGAAAAATCGGATAAGAT 

TCTAGTAATAAATGTTAAAAGTCCATTTCATTAAAAAAAAAAAAAAAAAAA 

>G872 Amino Acid Sequence (domain in AA coordinates: 18-85) 

MVKQAMKEEEKKRNTAMQSKYKGVRKRKWGKWVSEIRLPHSRERIWLGSYDTPEKAARAF

DAAQFCLRGGDANFNFPNNPPSISVEKSLTPPEIQEAAARFANTFQDIVKGEEESGLVPG 
SEIRPESPSTSASVATSTVDYDFSFLDLLPMNFGFDSFSDDFSGFSGGDRFTEILPIEDY

GGESLLDESLILWDF* 

>G904 (1..1005) 

atggaatctctcatcaatcccagccatggcggaggaaactacgattctcactcttcttct 

ctcgatagtctcaaaccaagcgtactagtcatcattctcattctcctcatgactcttctc 

atctccgtttccatttgcttcctcctccgctgtctcaatcgctgtagccaccgctccgtt 

ctccctctttcatcttcctcttccgtcgcaaccgtaacttccgattcccgacgattctct 

ggacatcgagtctctcccgaaacagaacggtcctccgtgcttgattcgcttccgattttc 

aaattctcctccgtcactcgccgatctagctccatgaattccggagattgcgccgtttgt 

ttgtcgaaattcgaaccggaggatcagctccgtcttcttcctctctgttgtcacgctttt 

cacgccgattgtatcgatatctggctagtctctaaccagacttgtcctctctgtcgctct 

cctctcttcgcttcagaatctgatctcatgaagtctctcgccgtcgtcggctcaaacaac 

ggcggaggagaaaacagcttccgtctcgaaatcggatccatcagccgtcgtcgtcaaaca 

ccgattccagaatccgttgagcagcatcgaacttactcaatcggttcgttcgattacata 

gtagacgacgtagattcagaaatctcagagtcaaatttcaaccgtggaaaacaggaagac 

gcgactacaacaactgccacagcaacggcggttacgactaatccgacgtcgtttgaagct 

agtttagcggcggatataggtaacgatggttctagaagctggctcaaggattacgttgac 

agactctcacgaggtatatcgtcgcgtgcaatgtcgtttagaagctctggtagatttttt 

actgggagtagtcgtcggagtgaggaattgacggtgatggatttagaagcgaatcatgcc 

ggagaagagataagtgagcttttccggtggctctcaggggtgtga 

>G904 Amino Acid Sequence (domain in AA coordinates: 117-158) 

MESLINPSHGGGNYDSHSSSLDSLKPSVLVIILILLMTLLISVSICFLLRCLNRCSHRSV 

LPLSSSSSVATVTSDSRRFSGHRVSPETERSSVLDSLPIFKFSSVTRRSSSMNSGDCAVC

LSKFEPEDQLRLLPLCCHAFHADCIDIWLVSNQTCPLCRSPLFASESDLMKSLAVVGSNN

GGGENSFRLEIGSISRRRQTPIPESVEQHRTYSIGSFDYIVDDVDSEISESNFNRGKQED

ATTTTATATAVTTNPTSFEASLAADIGNDGSRSWLKDYVDRLSRGISSRAMSFRSSGRFF

TGSSRRSEELTVMDLEANHAGEEISELFRWLSGV*

>G910 (1..1071) 

ATGTTATGTATAATAATAATTGAGAATATGGAAAGAGTATGTGAGTTTTGTAAAGCGTAT 
AGAGCAGTGGTTTATTGTATAGCTGATACAGCAAATCTTTGTTTAACATGTGATGCAAAG 
GTTCATTCAGCTAATTCACTCTCGGGACGGCATTTACGTACGGTTTTATGTGATTCTGGT 
AAGAATCAGCCTTGTGTTGTCCGATGTTTTGACCATAAAATGTTTCTTTGCCATGGATGT 
AATGATAAGTTTCATGGTGGTGGCTCTTCTGAGCATCGTAGAAGGGATTTGAGGTGTTAT 
ACGGGTTGTCCTCCTGCTAAAGATTTCGCGGTTATGTGGGGTTTTCGAGTTATGGATGAC 
GATGATGATGTTTCGTTAGAGCAATCTTTTCGAATGGTTAAACCTAAGGTGCAAAGAGAA 
GGTGGTTTTATCTTGGAACAGATTCTTGAATTGGAGAAGGTTCAGCTCAGGGAAGAGAAT 
GGTAGTTCTTCCTTGACAGAACGAGGTGATCCATCTCCATTGGAGCTTCCTAAGAAACCC 
GAAGAACAGTTAATCGATCTTCCGCAGACCGGAAAAGAGCTGGTTGTTGATTTTTCACAC 
TTGTCCTGATCTTCCACACTTGGTGATTC 

AACAATCAGTTGTGGCATCAAAATATACAAGACATTGGAGTATGTGAAGATACAATCTGC 

AGTGACGATGACTTCCAAATACCTGACATTGATCTCACTTTCCGGAACTTTGAAGAGCAA 

TTTGGAGCTGATCCTGAGCCAATTGCAGATAGTAACAACGTGTTCTTTGTTTCTTCCCTT 

GACAAATCACATGAGATGAAGACATTTTCTTCTTCATTCAATAATCCCATATTTGCACCT 

AAACCAGCTTCATCAACTATCTCATTCTCAAGCAGTGAAACCGATAACCCTTATAGTCAC 

TCAGAGGAAGTAATCTCATTTTGTCCCTCCCTCTCTAACAATACACGTCAAAAGGTCATC 

ACAAGGCTCAAGGAGAAGAAGAGAGCAAGAGTGGAGGAGAAAAAAGCTTAA 

>G910 Amino Acid Sequence (domain in AA coordinates: 14-37, 77-103)

MLCIIIIENMERVCEFCKAYRAVVYCIADTANLCLTCDAKVHSANSLSGRHLRTVLCDSG

KNQPCVVRCFDHKMFLCHGCNDKFHGGGSSEHRRRDLRCYTGCPPAKDFAVMWGFRVMDD

DDDVSLEQSFRMVKPKVQREGGFILEQILELEKVQLREENGSSSLTERGDPSPLELPKKP

EEQLIDLPQTGKELVVDFSHLSSSSTLGDSFWECKSPYNKNNQLWHQNIQDIGVCEDTIC

SDDDFQIPDIDLTFRNFEEQFGADPEPIADSNNVFFVSSLDKSHEMKTFSSSFNNPIFAP

KPASSTISFSSSETDNPYSHSEEVISFCPSLSNNTRQKVITRLKEKKRARVEEKKA*

>G912 (20..694)

AATCTCCGATCATAGATCTCCGGTTTCAGACAGTAGTGAGTGTTCACCAAAGTTAGCTTC 
AAGTTGTCCAAAGAAACGAGCTGGGAGGAAGAAGTTTCGTGAGACACGTCATCCGATTTA 
CAGAGGAGTTCGTCAGAGGAATTCTGGTAAATGGGTTTGTGAAGTTAGAGAGCCTAATAA
GAAATCTAGGATTTGGTTAGGTACTTTTCCGACGGTTGAAATGGCTGCTCGTGCTCATGA
TGTTGCTGCTTTAGCTCTTCGTGGTCGCTCTGCTTGTCTCAATTTCGCTGATTCTGCTTG 
GCGGCTTCGTATTCCTGAGACTACTTGTCCTAAGGAGATTCAGAAAGCTGCGTCTGAAGC 
TGCAATGGCGTTTCAGAATGAGACTACGACGGAGGGATCTAAAACTGCGGCGGAGGCAGA 
GGAGGCGGCAGGGGAGGGGGTGAGGGAGGGGGAGAGGAGGGCGGAGGAGCAGAATGGTGG 
TGTGTTTTATATGGATGATGAGGCGCTTTTGGGGATGCCCAACTTTTTTGAGAATATGGC 
GGAGGGGATGCTTTTGCCGCCGCCGGAAGTTGGCTGGAATCATAACGACTTTGACGGAGT 
GGGTGACGTGTCACTCTGGAGTTTTGACGAGTAATTTTTTGGCTCTTTTTCTGGATAATA 

AGTT 

>G912 Amino Acid Sequence (domain in AA coordinates: 51-118)

MNPFYSTFPDSFLSISDHRSPVSDSSECSPKLASSCPKKRAGRKKFRETRHPIYRGVRQR

NSGKWVCEVREPNKKSRIWLGTFPTVEMAARAHDVAALALRGRSACLNFADSAWRLRIPE

TTCPKEIQKAASEAAMAFQNETTTEGSKTAAEAEEAAGEGVREGERRAEEQNGGVFYMDD

EALLGMPNFFENMAEGMLLPPPEVGWNHNDFDGVGDVSLWSFDE*

>G920 (114..1154)

AAAAAATCTATTTTCTTCTCT^TCCACTATATTACAACATTTCTTCATTCTCAAATCATC 
ATACTAAAAACCTAAAAAAAGTTACATATTCATTGTATCTTTGTGAGAAAAAAATGGATT 
CGAATAGTAAO^CACGAAATCCATAAAGAGAAAAGTTGTCGACCAACTTGTCGAAGGCT 
ATGAATTCGCTACTCAGCTTCAGCTTCTCCTTTCTCATCAACACTCTAACCAGTACCACA 
TCGATGAGACCCGTCTTGTTTCCGGGTCGGGTTCAGTTTCCGGTGGTCCAGATCCCGTTG 
ATGAGCTCATGTCTAAGATCTTGGGATCTTTCCATAAAACTATATCGGTTCTTGATTCTT 
TTGATCCCGTCGCCGTCTCTGTCCCCATCGCCGTCGAGGGTTCATGGAATGCTTCATGTG 
GGGATGATTCGGCGACTCCGGTGAGTTGCAACGGTGGAGATTCCGGTGAGAGTAAGAAGA 
AGAGATTAGGGGTTGGTAAGGGTAAAAGAGGATGCTACACTAGAAAGACGAGATCACATA 
CAAGGATCGT^AAGCTAAAAGTTCTGAAGACAGATATGCTTGGAGGAAATATGGACAAA 
AGGAGATTCTTAATACCACATTCCCAAGAAGTTACTT^ 

AAGGATGCAAAGCAACAAAGCAAGTTCAGAAACAGGATCAAGATTCTGAGATGTTCCAAA 
TCACATACATTGGCTACCACACATGCACTGCCAATGACCAAACGCACGCGAAGACCGAGC 
CTTTTGATCAAGAAATCATTATGGATTCGGAAAAGACATTGGCTGCTAGCACTGCTCAGA 
ACCATGTCAATGCTATGGTGCAAGAGCAAGAGAACAACACCAGCAGTGTGACAGCAATAG 
ACGCAGGCATGGTTAAGGAGGAACAAAATAACAATGGTGATCAGAGTAAAGATTATTATG 
AGGGCTCTTCGACAGGTGAGGACTTGTC^TTGGTTTGGCAAGAGACGATGATGTTTGATG 
ATCATCAAAATCACTACTATTGTGGTGAAACCAGTACTACTTCTCATCAATTTGGTTTCA 
TCGACAACGATGATCAGTTTTCCTCCTTCTTCGACTCATATTGTGCTGATTATGAAAGAA 
CAAGTGCTATGTGAACATCCAAATCTGGAATGATGAATCAGCACTAGGTCTTCTCTTTGA 
GTATGTCTAGTTTAATGTAATATTTTTGTTGTATGTTTGATAAAAACACCATATATACTT 
CTCTTTTTACACCAAAAAAAAAAAAAAAAAAAAAAA 

>G920 Amino Acid Sequence (domain in AA coordinates: 152-211) 

^SNSNlNTTKSIKRKVVDQLVEGYEFATQLQLLLSHQHSNQYHIDETRLVSGSGSVSGGPD 

PVDELMSKILGSFHKTISVLDSFDPVAVSVPIAVEGSWNASCGDDSATPVSCNGGDSGES 

KKKRLGVGKGKRGCYTRKTRSHTRIVEAKSSEDRYAWRKyGQKEILNTTFPRSYFRCTHK 

PTQGCKATKQVQKQDQDSEMFQITYIGYHTCTANDQTHAKTEPFDQEIIMDSEKTLAAST 

AQNHVNAMVQEQENNTSSVTAIDAGM^^ 

FDDHQNHYYCGETSTTSHQFGFIDNDDQFSSFFDSYCADYERTSAM* 
>G939 (9..1565)

CAGATTCTATGGATATGTATAACAACAATATAGGGATGTTCCGGAGTTTAGTTTGTAGCT 
CGGCGCCTCCATTTACAGAGGGACATATGTGTTCTGATTCGCATACGGCTTTGTGCGATG 
ATCTGAGTAGTGATGAGGAAATGGAAATAGAGGAGCTTGAGAAGAAGATCTGGAGAGACA 
AGCAGCGTTTAAAGCGGCTCAAGGAAATGGCGAAGAACGGTCTAGGAACAAGATTGTTGT 
TGAAGCAGCAACATGATGATTTTCCAGAGCACTCTAGTAAGAGAACCATGTACAAGGCAC 
AAGATGGGATCTTGAAGTACATGTCGAAGACAATGGAGCGATATAAAGCTCAAGGTTTTG 
TTTATGGGATTGTGTTAGAGAATGGGAAAACGGTAGCGGGATCTTCTGATAATCTCCGTG 
AATGGTGGAAAGACAAAGTGAGGTTTGATAGGAACGGCCCAGCTGCTATAATCAAGCACC 
AAAGGGATATCAATCTTTCTGATGGAAGTGATTCAGGGTCTGAGGTTGGGGATTCTACCG 
CACAGAAGTTGCTTGAGCTTCAAGATACTACTCTTGGAGCTCTGTTATCGGCTCTGTTTC 
CTCACTGCAACCCTCCTCAGAGGCGGTTTCCGTTGGAGAAAGGCGTGACACCGCCATGGT 
GGCCAACGGGGAAAGAAGATTGGTGGGATCAACTGTCTTTACCCGTTGATTTTCGAGGTG
TTCCGCCACCTTACAAGAAGCCTCATGATCTCAAGAAGCTGTGGAAAATTGGTGTTTTGA 
TTGGTGTAATCAGACATATGGCTTCTGACATTAGCAACATACCCAATCTCGTGAGACGGT 
CTAGAAGTTTGCAGGAGAAAATGACGTCAAGAGAAGGCGCTTTATGGCTCGCTGCTCTTT 
ACCGAGAAAAGGCTATTGTTGATCAAATAGCCATGTCTAGAGAAAAC^CAACACTTCTA 
ACTTTCTTGTTCCTGCAACCGGTGGAGACCCAGATGTTTTGTTTCCTGAATCTACAGACT 
ATGATGTTGAACTGATTGGTGGCACTCATCGGACCAATCAGCAGTATCCTGAATTTGAAA 
ACAACTACAACTGTGTTTACAAGAGAT^GTTTGAAGAAGATTTTGGGATGCCAATGCATC 
CAAC^CTCCTAACATGTGAGAACAGTCTCTGTCCTTATAGCCAACCACATATGGGATTTC 
TTGACAGGAACTTAAGAGAGAATCACC^^TGACTTGTCCTTATAAAGTCACTTCCTTCT 
ACCAACCAACTAAACCCTATGGTATGACGGGTTTAATGGTTC 

GGATGCAGCAGCAGGTTCAGAGCTTTCAAGACCAGTTTAATCATCCCAACGATCTCTACA 

GACCAAAAGCTCCACAAAGAGGCAACGATGACTTGGTTGAGGATTTGAATCCTTCTCCTT 

CGACGCTGAATCAGAATCTTGGTTTAGTCTTACCTACTGACTTCAATGGAGGTGAGGAAA 

CAGTAGGAACAGAGAACAATCTGCATAATCAAGGGCAAGAGTTGCCCACATCTTGGATTC 

AGTAAAGAAAGCTTCAGAGTTTTCTTTTTATGTTTTCTAGTCTTTATAGCTTTGTCTCTT 

GCTTATTCTCTCATTAAACACAGTTTTTGATCTCTCCATTTCATAGCCCATGTAGCAATG 

GAGAAGATTAGGTTTCATAATAAGTTAATAACGAAATTCAAA - 

>G939 Amino Acid Sequence (domain in AA coordinates: 97-106) 

MDMYl^IGMFRSLVCSSAPPFTEGHMCSDS 

LKRLKEMAKNGLGTRLLLKQQHD 

IVLENGKTVAGSSDNLREWWKDKVRFDRNGPAAIIKHQRDINLSDGSDSGSEVGDSTAQK
LLELQDTTLGALLSALFPHCNPPQRRFPLEKGVTPPWWPTGKEDWWDQLSLPVDFRGVPP
PYKKPHDLKKLWKIGVLIGVIRHMASDISNIPNLVRRSRSLQEKMTSREGALWLAALYRE
KAIVDQIAMSRENNNTSNFLVPATGGDPDVLFPESTD^ 

NCVYKRKFEEDFGMPMHPTLLTCENSLCPYSQPHMGFLDRNLRENHQMTCPYKVTSFYQP
TKPYGMTGLI^CPDYNGMQQQVQSFQDQFNHPNDLYRPKAPQRGroDLVEDLNPSPS^ 
NQNLGLVLPTDFNGGEETVGTENNLHNQGQELPTSWIQ* 
>G963 (1..897) 

ATGAGTTTGCCTCCAGGATTCAGGTTTCATCCCACTGATGAAGAACTGGTGGCTTACTAT 
CTTGATAGGAAGGTCAACGGCCAAGCCATTGAGCTCGAGATCATCCCAGAAGTTGATCTT 
TATAAATGCGAGCCATGGGACTTGCCTGAAAAGTCATTTTTGCCGGGAAACGACATGGAA 
TGGTACTTTTACAGCACAAGGGATAAGAAGTATCCAAATGGCTCTAGGACGAACCGTGCG 
ACCCGAGCGGGTTACTGGAAGGCCACGGGGAAAGATCGTACAGTAGAATCAAAGAAGATG 
AAGATGGGAATGAAGAAGACACTGGTTTATTATAGAGGAAGGGCTCCTCATGGCCTTCGT 
ACTAATTGGGTCATGCATGAATATCGTCTCACGCACGCTCCTTCCTCCTCCTTGAAGGAG 
TCGTATGCATTGTGCCGAGTGTTTAAGAAGAACATACAAATTCCAAAGAGAAAAGGGGAA 
GAAGAAGAAGCAGAAGAAGAGAGCACTAGTGTAGGAAAAGAAGAGGAAGAAGAAAAGGAG 
AAGAAGTGGAGAAAATGTGATGGTAATTATATTGAAGACGAGAGCTTGAAAAGAGCATCC 
GCGGAGACATCTTCATCAGAGCTAACTCAAGGGGTCCTTTTAGACGAAGCAAACAGCTCA 
TCCATATTTGCTCTTCATTTCTCATCTTCTCTTCTGGACGATCATGATCATCTTTTCTCA 
AACTATTCTCATCAGCTTCCATATCATCCTCCTCTTCAACTCCAAGATTTCCCTCAACTT 
TCTATGAACGAAGCAGAGATTATGTCAATCCAACAAGACTTTCAATGCAGAGACTCTATG 
AACGGGACACTTGACGAAATCTTCTCTTCTTCCGCCACTTTCCCCGCTTCCCTTTGA 

>G963 Amino Acid Sequence (domain in AA coordinates: TBD) 
MSLPPGFRFHPTDEELVAYYLDRKVNGQAIELEIIPEVDLYKCEPWDLPEKSFLPGNDME

WYFYSTRDKKYPNGSRTNRATRAGYWKATGKDRTVESKKMKMGMKKTLVYYRGRAPHGLR

TNWVMHEYRLTHAPSSSLKESYALCRVFKKNIQIPKRKGEEEEAEEESTSVGKEEEEEKE
KKWRKCDGNYIEDESLKRASAETSSSELTQGVLLDEANSSSIFALHFSSSLLDDHDHLFS
NYSHQLPYHPPLQLQDFPQLSMNEAEIMSIQQDFQCRDSMNGTLDEIFSSSATFPASL* 
>G979 (60..1352)

CCTCTGAGGAATCAAATCACTC^CACTCCAAAAAAAAATCTAAACTTTCTCAGAGTTTAA 
TGAAGAAGCGCTTAACCACTTCCACTTGTTCTTCTTCTCCATCTTCCTCTGTTTCTTCTT 
CTACTACTACTTCCTCTCCTATTCAGTCGGAGGCTCCAAGGCCTAAACGAGCCAAAAGGG 
CTAAGAAATCTTCTCCTTCTGGTGATAAATCTCATAACCCGACAAGCCCTGCTTCTACCC 
GACGCAGCTCTATCTACAGAGGAGTCACTAGACATAGATGGACTGGGAGATTCGAGGCTC 
ATCTTTGGGACAAAAGCTCTTGGAATTCGATTCAG 
TGGGAGCATATGACAGTGAAGAAGCAGCAGCACATACGTACGATCTGGCTGCTCTCAAGT
ACTGGGGACCCGACACCATCTTGAATTTTCCGGCAGAGACGTACACAAAGGAATTGGAAG 
AAATGCAGAGAGTGACAAAGGAAGAATATTTGGCTTCTCTCCGCCGCCAGAGCAGTGGTT 
TCTCCAGAGGCGTCTCTAAATATCGCGGCGTCGCTAGGCATCACCACAACGGAAGATGGG 
AGGCTCGGATCGGAAGAGTGTTTGGGAACAAGTACTTGTACCTCGGCACCTATAATACGC 
AGGAGGAAGCTGCTGCAGCATATGAC^TGGCTGCGATTGAGTATCGAGGCGCAAACGCGG 
TTACTAATTTCGACATTAGTAATTACATTGACCGGTTAAAGAAGAAAGGTGTTTTCCCGT 
TCCCTGTGAACCAAGCTAACCATCAAGAGGGTATTCTTGTTGAAGCCAAACAAGAAGTTG 
AAACGAGAGAAGCGAAGGAAGAGCCTAGAGAAGAAGTGAAACAACAGTACGTGGAAGAAC 
CACCGC^AGAAGAAGAAGAGAAGGAAGAAGAGAAAGCAGAGCAACAAGAAGCAGAGATTG 
TAGGATATTCAGAAGAAGCAGCAGTGGTCAATTGCTGCATAGACTCTTCZAACCATAATGG 
AAATGGATCGTTGTGGGGACAACAATGAGCTGGCTTGGAACTTCTGTATGATGGATACAG 
GGTTTTCTCCGTTTTTGACTGATCAGAATCTCGCGAATGAGAATCCCATAGAGTATCCGG 
AGCTATTCAATGAGTTAGCATTTGAGGACAACATCGACTTCATGTTCGATGATGGGAAGC 
ACGAGTGCTTGAACTTGGAAAATCTGGATTGTTGCGTGGTGGGAAGAGAGAGCCCACCCT 
CTTCTTCTTCACCATTGTCTTGCTTATCTACTGACTCTGCTTCATCAACAACAACAACAA 
CAACCTCGGTTTCTTGTAACTATTTGGTCTGAGAGAGAGAGCTTTGCCTTCTAGTTTGAA 
TTTCTATTTCTTCCGCTTCTTCTTCTTTTT^ 

TATTTCAGTTTCAGGGCTTGTTCGTTGGTTCTGAATAATCAATGTCTTTGCCCCTTTT^ 
AANGNTNCAAGNTNAAANAAAAAAAAAAAA 

>G979 Amino Acid Sequence (domain in AA coordinates: 63-139,165-233) 

MKKRLTTSTCSSSPSSSVSSSTTTSSPIQSEAPRPKRAKRAKKSSPSGDKSHNPTSPAST 

RRSSIYRGVTRHRWTGRFEAHLWDKSSWNSIQNKKGKQVYLGAYDSEEAAAHTYDLAALK

YWGPDTILNFPAETYTKELEEMQRVTKEEYLASLRRQSSGFSRGVSKYRGVARHHHNGRW

EARIGRVFGNKYLYLGTYNTQEEAAAAYDMAAIEYRGANAVTNFDISNYIDRLKKKGVFP

FPVNQANHQEGILVEAKQEVETREAKEEPREEVKQQYVEEPPQEEEEKEEEKAEQQEAEI

VGYSEEAAVVNCCIDSSTIMEMDRCGDNNELAWNFCMMDTGFSPFLTDQNLANENPIEYP

ELFNELAFEDNIDFMFDDGKHECLNLENLDCCVVGRESPPSSSSPLSCLSTDSASSTTTT

TTSVSCNYLV* 

>G987 (1..4011) 

ATGGGTTCTTACTCAGCTGGCTTCCCTGGATCCTTGGACTGGTTTGATTTTCCCGGTTTA 

GGAAACGGATCCTATCTAAATGATCAACCTTTGTTAGATATTGGATCTGTTCCTCCTCCT 

CTAGACCCATATCCTCAACAGAATCTTGCTTCTGCGGATGCTGATTTCTCTGATTCTGTT 

TTGAAGTACATAAGCCAAGTTCTTATGGAAGAGGACATGGAAGATAAGCCTTGTATGTTT 

CATGATGCTTTATCTCTTCAAGCAGCTGAGAAGTCTCTCTATGAAGCTCTCGGCGAGAAG 

TACCCGGTTGATGATTCTGATCAGCCTCTGACTACTACTACTAGCCTTGCTCAATTGGTT 

AGTAGTCCTGGTGGTTCTTCTTATGCTTCAAGCACCACAACCACTTCCTCTGATTCACAA 

TGGAGTTTTGATTGTTTGGAGAATAATAGGCCTTCTTCTTGGTTGCAGACACCGATCCCG 

AGTAACTTCATTTTTCAGTCTACATCTACTAGAGCCAGTAGCGGTAACGCGGTTTTCGGG 

TCAAGTTTTAGCGGTGATTTGGTTTCTAATATGTTTAATGATACTGACTTGGCGTTACAA 

TTCAAGAAAGGGATGGAGGAAGCTAGTAAATTCCTTCCTAAGAGCTCTCAGTTGGTTATA 

GATAACTCTGTTCCTAACAGATTAACCGGAAAGAAGAGCCATTGGCGCGAAGAAGAACAT 

TTGACTGAAGAAAGAAGTAAGAAACAATCTGCTATTTATGTTGATGAAACTGATGAGCTT 

ACTGATATGTTTGACAATATTCTGATATTTGGCGAGGCTAAGGAAGAACCTGTATGCATT 

CTTAACGAGAGTTTCCCTAAGGAACCTGCGAAAGCTTCAACGTTTAGTAAGAGTCCTAAA 

GGCGAAAAACCGGAAGCTAGTGGTAACAGTTATACAAAAGAGACACCTGATTTGAGGACA 

ATGCTGGTTTCTTGTGCTCAAGCTGTTTCGATTAACGATCGTAGAACTGCTGACGAGCTG 

TTAAGTCGGATAAGeCAACATTCTTCATCTTACGGCGATGGAACAGAGAGATTGGCTCAT 

TATTTTGCTAACAGTCTTGAAGCACGTTTGGCTGGGATAGGTACACAGGTTTATACTGCC 

TTGTCTTCCAAGAAAACATCTACTTCTGACATGTTGAAAGCTTATCAGACATATATATCA 

GTCTGTCCGTTCAAGAAAATCGCAATCATATTCGCCAACCATAGTATTATGCGGTTGGCT 

TCAAGTGCTAATGCCAAAACCATCCACATCATAGATTTTGGAATATCTGATGGTT^ 

TGGCCTTCTCTGATTCATCGACTTGCTTGGAGACGTGGTTCATCTTGTAAGCTTCGGATA 

ACCGGTATAGAGTTGCCTCAACGTGGTTTTAGACCAGCCGAGGGAGTTATTGAGACTGGT 

CGTCGCTTGGCTAAGTATTGTCAGAAGTTCAATATTCCGTTTGAGTACAATGCGATTGCG 

CAGAAATGGGAATCAATCAAGTTGGAGGACTTGAAGCTAAAAGAAGGCGAGTTTGTTGCG 

GTAAACTCTTTATTTCGGTTTAGGAATCTTCTAGATGAGACGGTGGCAGTGCATAGCCCG 
AGAGATACGGTTTTGAAGCTGATAAGGAAGATAAAGCCAGACGTGTTCATCCCCGGGATC
CTCAGCGGATCCTACAACGCGCCTTTCTTTGTCAC 

TACTCATCTCTGTTTGACATGTGTGACACGAATCTAACACGGGAAGATCCAATGAGGGTT 

ATGTTTGAGAAAGAGTTCTATGGGCGGGAGATCATGAACGTGGTGGCGTGTGAGGGGACG 

GAGAGAGTGGAGAGGCCAGAGAGTTATAAGCAGTGGCAGGCGAGGGCGATGAGAGCCGGG 

TTTAGACAGATTCCGCTGGAGAAGGAACTAGTTC^GAAACTGAAGTTGATGG 

GGATACAAACCCAAAGAGTTTGATGTTGATCAGGATTGTCACTGGTTGCTTCAGGGCTGG

AAAGGTAGAATTGTATACGGTTC^TCTATTTGGGTTCCTTTCTTTTTCTATG 

GCAACTAGGGTTTTGATC^TGGATCCAAACTTCTCTGAATCTCTAAACGGCTTTGAGTAT 

TTTGATGGTAACCCTAATTTGCTTACTGATCCAATGGAAGATCAGTATCCACCACCATCT 

gatactctgttgaaatacgtgagtgagattc!:ttatggaagagagtaatggagattataag 
caatctatgttctatgattcattggctttacgaaaaactgaagaaatgttgcagcaagtc 
attactgattctcaaaat(^gtcctttagtcctgctgattcattgattac^ 

GATGCAAGCGGAAGCATCGATGAATCGGCTTATTCGGCTGATCCGCAACCTGTGAATGAA 

ATTATGGTTAAGAGTATGTTTAGTGATGCAGAATCAGCTTTACAGTTTAAGAAAGGGG^ 

GAAGAAGCTAGTAAATTCCTTCCCAATAGTGATCAATGGGTTATCAATCTGGATATCGAG 

AGATCCGAAAGGCGCGATTCGGTTAAAGAAGAGATGGGATTGGATCAGTTGAGAGTTAAG 

AAGAATCATGAAAGGGATTTTGAGGAAGTTAGGAGTAGTAAGCAATTTGCTAGTAATGTA 

GAAGATAGTAAGGTTACAGATATGTTTGATAAGGTTTTGCTTCTTGACGGTGAATGCGAT 

CCGCAAACATTGTTAGACAGCGAGATTCAAGCGATTCGGAGTAGTAAGAACATAGGAGAG 

AAAGGGAAGAAGAAGAAGAAGAAGAAGAGTCAAGTGGTTGATTTTCGTACACTTCTCACT 

CATTGTGCACAAGCC^TTTCCAC^GGAGATAAAACCACGGCTCTTGAGTTTCrGTTACAG 

ATAAGGCAACAGTCTTCGCCTCTCGGTGACGCGGGGCAAAGACTAGCTCATTGTTTCGCT 

AACGCGCTTGAAGCTCGTCTACAGGGAAGTACCGGTCCTATGATCCAGACTTATTACAAT 

GCTTTAACCTCGTCGTTGAAGGATACTGCTGCGGATACAATTAGAGCGTATCGAGTTTAT 

CTTTCTTCGTCTCCGTTTGTTACCTTGATGTATTTCTTCTCCATCTGGATGATTCTTGAT 

GTGGCTAAAGATGCTCCTGTTCTTC^TATAGTTGATTTTGGGATTCTATACGGGTTTCAA 

TGGCCGATGTTTATTCAGTCTATATCAGATCGAAAAGATGTACCGCGGAAGCTGCGGATT 

ACTGGTATCGAGCTTCCTCAGTGCGGGTTTCGGCCCGCGGAGCGAATAGAGGAGACAGGA 

CGGAGATTGGCTGAGTATTGTAAACGGTTTAATGTTCCGTTTGAGTACAAAGCCATTGCG 

TCTCAGAACTGGGAAACAATCCGGATAGAAGATCTCGATATACGACCAAACGAAGTCTTA 

GCGGTTAATGCTGGACTTAGACTCAAGAACCTTCAAGATGAAACAGGAAGCGAAGAGAAT 

TGCCCGAGAGATGCTGTCTTGAAGCTAATAAGAAACATGAACCCGGACGTTTTCATCCAC 

GCGATTGTCAACGGTTCATTCAACGCACCCTTCTTTATCTCGCGGTTTAAAGAAGCGGTT 

TACCATTACTCCGCTCTCTTCGACATGTTTGATTCGACGTTGCCTCGGGATAACAAAGAG 

AGGATTAGGTTCGAGAGGGAGTTTTACGGGAGAGAGGCTATGAACGTGATAGCGTGCGAG 

GAAGCTGATCGAGTGGAGAGGCCTGAGACTTACAGGCAATGGCAGGTTAGAATGGTTAGA 

GCCGGGTTTAAGCAGAAAACGATTAAGCCTGAGCTGGTAGAGTTGTTTAGAGGAAAGCTG 

AAGAAATGGCGTTACCATAAAGACTTTGTGGTTGATGAAAATAGTAAATGGTTGTTACAA 

GGCTGGAAAGGTCGAACTCTCTATGCTTCTTCTTGTTGGGTTCCTGCCTAG 

>G987 Amino Acid Sequence (domain in AA coordinates: 428-432,704-708) 

MGSYSAGFPGSLDWFDFPGLGNGSYLNDQPLLDIGSVPPPLDPYPQQNLASADADFSDSV

LKYISQVLMEEDMEDKPCMFHDALSLQAAEKSLYEALGEKYPVDDSDQPLTTTTSLAQLV 

SSPGGSSYASSTTTTSSDSQWSFDCLENNRPSSWLQTPIPSNFIFQSTSTRASSGNAVFG 

SSFSGDLVSNMFNDTDLALQFKKGMEEASKFLPKSSQLVIDNSVPNRLTGKKSHWREEEH

LTEERSKKQSAIYVDETDELTDMFDNILIFGEAKEQPVCILNESFPKEPAKASTFSKSPK

GEKPEASGNSYTKETPDLRTMLVSCAQAVSINDRRTADELLSRIRQHSSSYGDGTERLAH

YFANSLEARLAGIGTQVYTALSSKKTSTSDMLKAYQTYISVCPFKKIAIIFANHSIMRLA

SSANAKTIHIIDFGISDGFQWPSLIHRLAWRRGSSCKLRITGIELPQRGFRPAEGVIETG 

RRLAKYCQKFNIPFEYNAIAQKWESIKLEDLKLKEGEFVAVNSLFRFRNLLDETVAVHSP

RDTVLKLIRKIKPDVFIPGILSGSYNAPFFVTRFREVLFHYSSLFDMCDTNLTREDPMRV 

MFEKEFYGREIMNWACEGTERVERPESYKQWQARAMRAGFRQIPLEKELVQKLKLMVES 

GYKPKEFDVDQDCHWLLQGWKGRIVYGSSIWVPFFFYVGRATRVLIMDPNFSESLNGFEY 

FDGNPNLLTDPMEDQYPPPSDTLLKYVSEILMEESNGDYKQSMFYDSLALRKTEEMLQQV

ITDSQNQSFSPADSLITNSWDASGSIDESAYSADPQPVNEIMVKSMFSDAESALQFKKGV 

EEASKFLPNSDQWVINLDIERSERRDSVKEEMGLDQLRVKKNHERDFEEVRSSKQFASNV 

EDSKVTDMFDK\njLLDGECDPQTLLDSEIQAIRSSKNIGEKGKKKKKKKSQVVDFRTLLT 
HCAQAISTGDKTTALEFLLQIRQQSSPLGDAGQRLAHCFANALEARLQGSTGPMIQTYYN
ALTSSLKDTAADTIRAYRVYLSSSPFVTIi^FFSlWMlLDVAKDAPVLHIVDFGILYGFQ 
WPMFIQSISDRKDVPRKLRITGIELPQCGFRPAERIEETGRRLAEYCKRFNVPFEYKAIA 
SQNWETIRIEDLDIRPNEVLAVNAGLRLKNLQDETGSEENCPRDAVLKLIRNMNPDVFIH 
AIVNGSFNAPFFISRFKEAVYHYSALFDMFDSTLPRDNKERIRFEREFYGREAMNVIACE 
EADRVERPETYRQWQVRMVRAGFKQKTIKPELVELFRGKLKKWRYHKDFVVDENSKWLLQ 

GWKGRTLYASSCWVPA* 
>G993 (6..1091) 

CAAATATGGAATACAGCTGTGTAGACGACAGTAGTACAACGTCAGAATCTCTCTCCATCT 

CTACTACTCCAAAGCCGACAACGACGACGGAGAAGAAACTCTCTTCTCCGCCGGCGACGT 

CGATGCGTCTCTACAGAATGGGAAGCGGCGGAAGCAGCGTCGTTTTGGATTCAGAGAACG 

GCGTCGAGACCGAGTCACGTAAGCTTCCTTCGTCGAAATATAAAGGCGTTGTGCCTCAGC 

CTAACGGAAGATGGGGAGCTCAGATTTACGAGAAGCATCAGCGAGTTTGGCTCGGTACTT 

TCAACGAGGAAGAAGAAGCTGCGTCTTCTTACGACATCGCCGTGAGGAGATTCCGCGGCC 

GCGACGCCGTCACTAACTTCAAATCTCAAGTTGATGGAAACGACGCCGAATCGGCTTTTC 

TTGACGCTCATTCTAAAGCTGAGATCGTGGATATGTTGAGGAAACACACTTACGCCGATG 

AGTTTGAGCAGAGTAGACGGAAGTTTGTTAACGGCGACGGAAAACGCTCTGGGTTGGAGA 

CGGCGACGTACGGAAACGACGCTGTTTTGAGAGCGCGTGAGGTTTTGTTCGAGAAGACTG 

TTACGCCGAGCGACGTCGGGAAGCTGAACCGTTTAGTGATACCGAAACAACACGCGGAGA 

AGCATTTTCCGTTACCGGCGATGACGACGGCGATGGGGATGAATCCGTCTCCGACGAAAG 

GCGTTTTGATTAACTTGGAAGATAGAACAGGGAAAGTGTGGCGGTTCCGTTACAGTTACT 

GGAACAGCAGTCAAAGTTACGTGTTGACCAAGGGCTGGAGCCGGTTCGTTAAAGAGAAGA 

ATCTTCGAGCCGGTGATGTGGTTTGTTTCGAGAGATCAACCGGACCAGACCGGCAATTGT 

ATATCCACTGGAAAGTCCGGTCTAGTCCGGTTCAGACTGTGGTTAGGCTATTCGGAGTCA 

ACATTTTCAATGTGAGTAACGAGAAACCAAACGACGTCGCAGTAGAGTGTGTTGGCAAGA 

AGAGATCTCGGGAAGATGATTTGTTTTCTTTAGGGTO 

ACATCTTGTGACAAATTCTTTTTTTTTGGTTTTT^ 

ATATTTTGTATTGAAATGACAAGTTGTAAATTAGGAC 

GACAAAATAGTTTTTGTTTAAAAAAAAAAAAAAAAAAAA 

>G993 Amino Acid Sequence (domain in AA coordinates: 69-134) 

MEYSCVDDSSTTSESLSISTTPKPTTTTEKKLSSPPATSMRLYRMGSGGSSVVLDSENGV 

ETESRKLPSSKYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEEEAASSYDIAVRRFRGRD 

AVTNFKSQVDGNDAESAFLDAHSKAEIVDMLRKHTYADEFEQSRRKFVNGDGKRSGLETA 

TYGNDAVLRAREVLFEKTVTPSDVGKLNRLVIPKQHAEKHFPLPAMTTAMGMNPSPTKGV 

LINLEDRTGKVWRFRYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVCFERSTGPDRQLYI 

HWKVRSSPVQTVVRLFGVNIFNVSNEKPNDVAVECV 

L* 

>G681 (1..804) 

ATGGGGAGGACGACATGGTTCGACGTCGACGGGATGAAGAAAGGAGAGTGGACGGCAGAG 
GAAGACCAGAAGCTCGGCGCTTACATCAACGAGCATGGCGTTTGTGATTGGCGTTCCCTC 
CCCAAAAGAGCTGGTTTGCAGAGATGTGGAAAGAGCTGCAGATTAAGGTGGCTTAACTAT 
CTAAAGCCTGGGATTAGAAGAGGCAAATTCACTCCTCAAGAAGAAGAAGAAATCATCCAA 
CTTCATGCTGTTCTCGGAAACAGGTGGGCAGCCATGGCGAAGAAGATGCAGAATCGAACA 
GACAATGATATCAAGAACCATTGGAACTCTTGTCTCAAGAAAAGACTTTCGAGAAAGGGA 
ATCGACCCTATGACCCACGAGCCCATCATCAAACACCTCACCGTCAATACCACTAACGCA 
GATTGTGGTAACTCTTCCACCACGACGTCCCCGTCGACGACGGAAAGCTCTCCTTCCTCC 
GGCTCGTCTCGTCTTCTTAACAAACTCGCCGCAGGTATCTCATCTAGACAACATAGTCTC 
GATAGGATCAAGTACATCTTGTCGAATTCAATAATCGAAAGCAGTGATCAAGCAAAAGAG 
GAAGAAGAAAAAGAAGAAGAAGAAGAAGAAAGAGATTCAATGATGGGTCAGAAGATTGAC 
GGTAGTGAAGGAGAAGATATTCAGATTTGGGGCGAGGAGGAAGTTAGGCGTTTAATGGAG 
ATTGATGCAATGGATATGTACGAGATGACTTCGTACGACGCTGTCATGTACGAGAGTAGT 
CACATACTTGATCATCTCTTTTGACTTAATATAGTGTGACTGTGTGAGTGCATGCATGTT 

>G681 Amino Acid Sequence (domain in AA coordinates: 14-120) 
MGRTTWFDVDGMKKGEWTAEEDQKLGAYINEHGVCDWRSLPKRAGLQRCGKSCRLRWLNY 
LKPGIRRGKFTPQEEEEIIQLHAVLGNRWAAMAKKMQNRTDNDIKNHWNSCLKKRLSRKG 
IDPMTHEPIIKHLTVNTTNADCGNSSTTTSPSTTESSPSSGSSRLLNKLAAGISSRQHSL 

DRIKYILSNSIIESSDQAKEEEEKEEEEEERDSMMGQKIDGSEGEDIQIWGEEEVRRLME 
IDAMDMYEMTSYDAVMYESSHILDHLF* 
>G1482 (1..996) 

ATGAAGATCAGGTGCGACGTCTGCGATAAAGAAGAAGCGTCGGTGTTTTGCACGGCCGAC 
GAAGCATCTCTCTGCGGCGGCTGCGACCACCAAGTCCACCACGCTAACAAACTCGCCTCT 

AAACATCTCCGTTTCTCTCTCCT 

GACATCTGTCAGGATAAAAAAGCTCTGTTGTTCTGTCAACAAGATAGAGCTATTTTATGC 
AAAGATTGCGATTCATCGATCCACGCTGCGAACGAACACACAAAGAAACACGATAGGTTT 

CTTCTTACAGGGGTTAAGCTCTCTGCAACATC^ 
TCTTCTTCTTCTTCAAGCAACCAAGAT^ 

CCTCCTCTCAAGAAACCTCTCTCAGCTCCTCCTCAGAGCAACAAGATCCAACCCTTTTCG 

AAGATCAACGGCGGTGATGCGTCGGTGAATCAGTGGGGATCCACAAGCACGATTTCTGAG 

TATTTGATGGATACGTTACCTGGTTGGCACGTTGAGGATTTCCTCGATTCCTCTCTTCCT 

ACTTATGGTTTCTCTAAGAGTGGTGATGATGATGGAGTGTTACCATATATGGAACCAGAA 

GATGACAACAACACTAAGAGAAACAACAACAACAACAACAACAACAACAACAATACAGTG 

TCACTTCCATCTAAGAATTTAGGGATTTGGGTCCCTCAGATTCCACAAACTCTTCCTTCT 

TCATACCCAAATCAATACTTTTCTCAAGACAACAACATACAGTTTGGGATGTACAACAAA 

GAAACATCACCAGAAGTAGTGTCTTTTGCTCCAATACAAAACATGAAAC^ 

AACAACAAGAGATGGTATGATGATGGTGGCTTCACTGTCCCACAGATCACTCCTCCTCCT 

CTTTCTTCTAATAAAAAGTTTAGATCTTTCTGGTAA 

>G1482 Amino Acid Sequence (domain in AA coordinates: 5-63) 
MKIRCDVCDKEEASVFCTADEASLCGGCDHQVHHANKLASKHLRFSLLYPSSSNTSSPLC 
DICQDKKALLFCQQDRAILCKDCDSSIHAANEHTKKHDRFLLTGVKLSATSSVYKPTSKS 
SSSSSSNQDFSVPGSSISNPPPLKKPLSAPPQSNKIQPFSKINGGDASVNQWGSTSTISE 
YLMDTLPGWHVEDFLDSSLPTYGFSKSGDDDGVLPYMEPEDDNNTKRNNNNNNNNNNNTV 

SLPSKNLGIWVPQIPQTLPSSYPNQYFSQDNNIQFGMYNKETSPEVVSFAPIQNMKQQGQ 

NNKRWYDDGGFTVPQITPPPLSSNKKFRSFW* 

>G225 (157..441) 

CTCTCTCTCTCACTCTTTTCTTTTCCGAGAACCCAACAAAAAAAAAGCTACTATTAATCC 

TTCCCCTCGTGAGGAAATCATTTCTTCnTGTTTCTCGAGATTTATTCTCTTTCTCTCTCT 

CTTTCTCTGTGTGTTTCGTGTCTTCAGATTAGTTCGATGTTTCGTTCAGACAAGGCGGAA 

AAAATGGATAAACGACGACGGAGACAGAGCAAAGCCAAGGCTTCTTGTTCCGAAGAGGTG 

AGTAGTATCGAATGGGAAGCTGTGAAGATGTCAGAAGAAGAAGAAGATCTCATTTCTCGG 

ATGTATAAACTCGTTGGCGACAGGTGGGAGTTGATCGCCGGAAGGATCCCGGGACGGACG 

CCGGAGGAGATAGAGAGATATTGGCTTATGAAACACGGCGTCGTTTTTGCCAAGAGACGA 

AGAGACTTTTTTAGGAAATGATTTTTTTTGTTTGGATTAAAAGAAAATTTTCCTCTCCTT 

AATTCACAAGACAAGATyUUVAAGGAAATGTACCTGTCCTTGAATTACTATTTTGGAATGT 

ATAATTATCTATATATATAAGAAGAAAAAATTGCTTAGGAATTT 

>G225 Amino Acid Sequence (domain in AA coordinates: 39-76) 

MFRSDKAEKMDKRRRRQSKAKASCSEEVSSIEWEAVKMSEEEEDLISRMYKLVGDRWELI 

AGRIPGRTPEEIERYWLMKHGVVFANRRRDFFRK* 

>G226 (10..348) 

CCAGTAGTTATGGATAATACCAACCGTCTTCGTCTTCGTCGCGGTCCCAGTCTTAGGCAA 
ACTAAGTTCACTCGATCCCGATATGACTCTGAAGAAGTGAGTAGCATCGAATGGGAGTTT 
ATCAGTATGACCGAACAAGAAGAAGATCTCATCTCTCGAATGTACAGACTTGTCGGTAAT 
AGGTGGGATTTAATAGCAGGAAGAGTCGTAGGAAGAAAGGCAAATGAGATTGAGAGATAC 
TGGATTATGAGAAACTCTGACTATTTTTCTCACAAACGACGACGTCTTAATAATTCTCCC 
TTTTTTTCTACTTCTCCTCTTAATCTCCAAGAAAATCTAAAATTGTAAAGAAATCAAAAT 
AAAAGCTTTCAATCATAAAAGTAGAACAAATCTTGAATGTCTTCTCA 

>G226 Amino Acid Sequence (domain in AA coordinates: 28-78) 
MDNTNRLRLRRGPSLRQTKFTRSRYDSEEVSSIEWEFISMTEQEEDLISRMYRLVGNRWD 

LIAGRVVGRKANEIERYWIMRNSDYFSHKRRRLNNSPFFSTSPLNLQENLKL* 

>G9 (81..1139) 

GTGTTTCTTCTTTCTGCTAAAAGGTTATAATTTTTGTTTCTTGGTTTGGTGAGAATCTTC 
AAGAAACTGAAACAAAGAAAATGGATTCTAGTTGCATAGACGAGATAAGTTCCTCCACTT 
CAGAATCTTTCTCCGCCACCACCGCCAAGAAGCTCTCTCCTCCTCCCGCGGCGGCGTTAC 
GCCTCTACCGGATGGGAAGCGGCGGGAGCAGCGTCGTGTTGGATCCCGAGAACGGCCTAG 
AGACGGAGTCACGAAAGCTACCATCTTCAAAATACAAAGGTGTTGTTCCTCAGCCTAACG 
GAAGATGGGGAGCTCAGATCTACGAGAAGCACCAACGAGTATGGCTCGGGACTTTCAACG 

AGCAAGAAGAAGCTGCTCGTTCCTACGACATCGCAGCTTGTAGATTCCGTGGCCGCGACG 

CCGTCGTCAACTTCAAGAACGTTCTGGAAGACGGCGATTTAGCTTTTCTTGAAGCTCACT 

CAAAGGCCGAGATCGTCGACATGTTGAGAAAACACACTTACGCCGACGAGCTTGAACAGA 

ACAATAAACGGCAGTTGTTTCTCTCCGTCGACGCTAACGGAAAACGTAACGGATCGAGTA 

CTACTCAAAACGACAAAGTTTTAAAGACGTGTGAAGTTCTTTTCGAGAAGGCTGTTACAC 

CTAGCGACGTTGGGAAGCTAAACCGTCTCGTGATACCTAAACAACACGCCGAGAAACACT 

TTCCGTTACCGTCACCGTCACCGGCAGTGACTAAAGGAGTTTTGATCAACTTCGAAGACG 

TTAACGGTAAAGTGTGGAGGTTCCGTTACTCATACTGGAACAGTAGTCAAAGTTACGTGT 

TGACCAAGGGATGGAGTCGATTCGTCAAGGAGAAGAATCTTCGAGCCGGTGATGTTGTTA 

CTTTCGAGAGATCGACCGGACTAGAGCGGCAGTTATATATTGATTGGAAAGTTCGGTCTG 

GTCCGAGAGAAAACCCGGTTCAGGTGGTGGTTCGGCTTTTCGGAGTTGATATCTTTAATG 

TGACCACCGTGAAGCCAAACGACGTCGTGGCCGTTTGCGGTGGAAAGAGATCTCGAGATG 

TTGATGATATGTTTGCGTTACGGTGTTCCAAGAAGCAGGCGATAATCAATGCTTTGTGAC 

ATATTTCCTTTTCCGATTTTATGCTTTCGTTTTTTAATTTTTTTTTTTC 

AGGTTGTGATTCATGCTAGGTTGTATTTAGGAAAAGAGATAAGACC 

>G9 Amino Acid Sequence (domain in AA coordinates: 62-127) 

MDSSCIDEISSSTSESFSATTAKKLSPPPAAALRLYRMGSGGSSVVLDPENGLETESRKL 

PSSKYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEQEEAARSYDIAACRFRGRDAVVNFKN 

VLEDGDLAFLEAHSKAEIVDMLRKHTYADELEQNNKRQLFLSVDANGKRNGSSTTQNDKV 

LKTCEVLFEKAVTPSDVGKLNRLVIPKQHAEKHFPLPSPSPAVTKGVLINFEDVNGKVWR 
FRYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVTFERSTGLERQLYIDWKVRSGPRENPV 
QVVVRLFGVDIFNVTTVKPNDVVAVCGGKRSRDVDDMFALRCSKKQAIINAL* 
>G1040 (51..863) 

CTTTGATCTCCACTATTTAAGTAGACAAGAATCATAAAGAAAATAGTGAGATGATGATGT 
TAGAGTCAAGAAACAGTATGAGAGCTTCAAACTCAGTCCCAGATCTGTCTCTTCAGATCA 
GTCTTCCTAACTATCACGCCGGAAAACCTCTTCACGGCGGTGACCGGAGCTCCACAAGCA 
GTGATTCTGGAAGCAGCCTCAGTGACCTGAGCCATGAGAACAACTTCTTCAACAAACCTC 
TCTTGAGCTTAGGATTTGACCATCATCATCAAAGGCGCTC^^C^TGTTCC^CCTCA^ 
TCTACGGTCGAGATTTCAAGAGAAGCTCATCATCAATGGTTGGTCTTAAACGAAGCATTC 
GTGCTCCAAGAATGAGATGGACTTCTACTCTTCATGCTCACTTCGTCCATGCTGTTCAAC 
TTCTTGGCGGCCATGAAAGAGCAACGCCTAAATCAGTGTTGGAGCTCATGAATGTGAAGG 
ATCTAACCCTAGCTCATGTCAAGAGTCACTTGCAGATGTATAGAACAGTGAAATGCACTG 
ATAAAGGATCACCAGGAGAAGGAAAGGTAGAGAAAGAGGCAGAGCAGAGGATAGAGGACA 
ATAATAATAATGAAGAAGCTGATGAAGGAACTGACACAAATTCGCCAAACTCATCATCTG 
TGCAAAAGACCCAAAGAGCTTCATGGTCATCGACAAAGGAAGTATCTAGGAGCATATCTA 
CACAAGCATATTCTCACTTGGGAACAACTCATCACAGTAAGGCCAATGAAGAGAAAGAGG 
ATACCAACATTCATCTCAATTTGGATTTCACATTGGGCGGCCTAGTTGGGGGATGGAATA 
TGCGGAACCCTCCAGTGATTTAACCCTTCTCAAGTGCTAATTGCCTTAAGCTACAACAAA 
TAAGTCAGCTTAGGTTACCAGTTTTAACATAATTTTAACTTGTTTTGATCATATGAGCTT 
CGGAAGAAT(^TATTATCATCATATATGAACTTCTTTCCAAGAATGTTCTATGAGTTTTT 
TGATATGTATAATCAAGAGAATCGTTTGAAGTAAAAA 

>G1040 Amino Acid Sequence (domain in AA coordinates: 109-158) 
MMMLESRNSMRASNSVPDLSLQISLPNYHAGKPLHGGDRSSTSSDSGSSLSDLSHENNFF 
NKPLLSLGFDHHHQRRSNMFQPQIYGRDFKRSSSSMVGLKRSIRAPRMRWTSTLHAHFVH 
AVQLLGGHERATPKSVLELMNVKDLTLAHVKSHLQMYRTVKCTDKGSPGEGKVEKEAEQR 

IEDNNNNEEADEGTDTNSPNSSSVQKTQRASWSSTKEVSRSISTQAYSHLGTTHHTKANE 
EKEDTNIHLNLDFTLGGLVGGWNMRNPPVI* 
>G2114 (64..1311) 

ATAAAACGAAACCCTATACATATAAACTAAGAGCGAGAAAGACAGCTAGAGAGAGAGAGA 
GAGATGAAGAAATGGTTGGGATTTTCATTGACACCTCCTTTGAGAATCTGCAATAGTGAA 
GAAGAAGAACTTAGGCATGACGGTTCCGATGTTTGGAGATATGATATTAACTTTGATCAT 
CATCATCATGATGAAGACGTTCCAAAGGTGGAAGATCTCCTCTCAAACTCTCATCAAACC 
GAGTATCCTATAAACCATAACCAAACCAATGTCAACTGCACCACTGTGGTTAACAGGTTA 
AACCCACCCGGTTACCTTCTCCACGACCAAACCGTAGTTACACCACATTACCCGAACCTA 
GATCCGAACCTTAGCAATGATTATGGAGGTTTTGAGAGGGTCGGTTCGGTCTCGGTTTTC 
AAATCXTGGTTAGAGCAAGGCACTCCAGCATTCCCACTCTCGAGTCATTACGTTACTGAA 
GAGGCTGGTACGAGCAATAATATTAGTCATTTTAGTAACGAAGAGACTGGTTATAACACC 
AATGGCTCAATGCTATCATTGGCTTTGAGCCATGGGGCTTGTTCTGATTTGATCAACGAA 
TCGAATGTATCCGCACGGGTCGAAGAACCGGTTAAGGTAGATGAGAAGCGGAAGAGATTG 
GTTGTTAAACCTCAGGTAAAGGAATCCGTTCCTCGGAAGTCGGTTGATAGTTATGGACAA 
AGAACTTCTCAGTATCGTGGAGTTACAAGGCATAGATGGACAGGGAGATATGAAGCTCAC 
TTATGGGATAATAGCTGTAAGAAGGAGGGACAGACAAGGAGAGGAAGACAAGTGTATCTT 
GGAGGGTATGATGAGGAGGAGAAAGCAGCGAGGGCATATGATTTAGCGGCTCTGAAGTAT 
TGGGGTCCTACCACTCACTTAAATTTCCCTTTGAGTAATTACGAAAAGGAGATCGAGGAA 
CTCAATAACATGAATCGGCAAGAATTTGTTGCCATGTTGAGGAGGAATAGCAGCGGGTTT 
TCGAGGGGAGCTTCCGTGTATAGAGGAGTTACAAGGCATCATCAACATGGAAGGTGGCAA 
GCCAGAATTGGAAGAGTTGCTGGAAACAAGGACTTGTACCTTGGAACATTTAGCACGCAA 
GAAGAAGCAGCGGAGGCGTACGATATCGCGGCAATTAAATTCAGAGGCCTAAACGCTGTA 
ACCAATTTCGATATAAATAGATATGACGTGAAGAGGATATGTTCAAGCTCAACGATTGTT 
GATAGCGACCAGGCCAAACATTCTCCCACCAGCTCTGGCGCCGGCCACTAACCGACACCG 
TAAACTCCTCGCCGGAGAGACTATTCCCACGTACGGTTGGTTTGAGGAAATAAGTTCGTC 
CAGTCTGTTTAATCATTTATGGTTTAATAAACATATATTCCTAAGTAATTGAGGCCGGTC 
TACATATATACAACTTTTTTAGCAAATTAAGTTATCAGAATCC^CTATATATTATO 

>G2114 Amino Acid Sequence (conserved domain in AA coordinates: 221-297, 323-393) 

MKKWLGFSLTPPLRICNSEEEELRHDGSDVWRYDINFDHHHHDEDVPKVEDLLSNSHQTE 

YPINHNQTNVNCTTVVNRLNPPGYLLHDQTVVTPHYPNLDPNLSNDYGGFERVGSVSVFK 

SWLEQGTPAFPLSSHYVTEEAGTSNNISHFSNEETGYNTNGSMLSLALSHGACSDLINES 
NVSARVEEPVKVDEKRKRLVVKPQVKESVPRKSVDSYGQRTSQYRGVTRHRWTGRYEAHL 
WDNSCKKEGQTRRGRQVYLGGYDEEEKAARAYDLAALKYWGPTTHLNFPLSNYEKEIEEL 

NNMNRQEFVAMLRRNSSGFSRGASVYRGVTRHHQHGRWQARIGRVAGNKDLYLGTFSTQE 
EAAEAYDIAAIKFRGLNAVTNFDINRYDVKRICSSSTIVDSDQAKHSPTSSGAGH* 
>G450 (65..751) 

GAGTTATCGAGAGAGAGAGAAAACATATTCTGATTTAAGACATATATAGACAGCAAGAAG 
AGATATGAACCTTAAGGAGACGGAGCTTTGTCTTGGCCTCCCCGGAGGCACTGAAACCGT 
TGAAAGTCCGGCCAAGTCGGGTGTTGGGAACAAGAGAGGCTTCTCCGAGACCGTTGATCT 
CAAACTTAATCTTCAATCTAACAAACAAGGACATGTGGATCTCAACACTAATGGAGCTCC 
CAAGGAGAAGACCTTCCTTAAAGACCCTTCTAAGCCTCCTGCTAAAGCACAAGTGGTGGG 
TTGGCCACCGGTGAGGAACTACCGGAAAAATGTTATGGCTAATCAGAAGAGCGGCGAAGC 
AGAGGAGGCAATGAGTAGTGGTGGAGGAACCGTCGCCTTTGTGAAGGTTTCCATGGATGG 
AGCTCCTTATCTTCGGAAGGTTGACCTCAAGATGTACACCAGCTACAAGGATCTCTCTGA 
TGCCTTGGCCAAAATGTTCAGCTCCTTTACCATGGGGAGTTATGGAGCACAAGGGATGAT 
AGATTTCATGAACGAGAGTAAAGTGATGGATCTGTTGAACAGTTCTGAGTATGTTCCAAG 
CTACGAGGACAAAGATGGTGACTGGATGCTCGTTGGTGATGTCCCCTGGCCGATGTTTGT 
CGAGTCATGCAAACGTTTGCGCATAATGAAAGGATCCGAAGCAATTGGACTTGCTCCAAG 
AGC^TGGAGAAGTTCAAGAACAGATCATGAACAAAAAAAAAAGAGGAC^ATATGCATTG 
ATTTTTTTTTTTTTGGTATTGTTATGATCATGTGTTTTAATTTAAAATATAGGAAGGATA 
TAGGAAAATATAATTGTTTACAAAAAAATAACTTTAAATATGTCTTTTTTTTTTTTTTGA 
AATTAGTCTGTGTTTTTGTTTTCATCTCTTAATTAGTAGAAATCATTTTTTAATATGTAA 
TTGTGATAGTAAATCTATAGAGTTCGTA 

>G450 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNLKETELCLGLPGGTETVESPAKSGVGNKRGFSETVDLKLNLQSNKQGHVDLNTNGAPK 

EKTFLKDPSKPPAKAQVVGWPPVRNYRKNVMANQKSGEAEEAMSSGGGTVAFVKVSMDGA 

PYLRKVDLKMYTSYKDLSDALAKMFSSFTMGSYGAQGMIDFMNESKVMDLLNSSEYVPSY 

EDKDGDWMLVGDVPWPMFVESCKRLRIMKGSEAIGLAPRAMEKFKNRS* 

>G584 (40..1809) 

AAAAAGTCTTOTCTTTTATAACTACGTCAGAGAACTGTTATGTCTCCGACGAATGTTCAA 
GTAACCGATTACCATCTCAACCAATCAAAAACGGATACAACAAATCTCTGGTCAACCGAC 
GACGATGCATCGGTAATGGAAGCTTTCATCGGCGGCGGCTCCGATCATTCTTCTCTTTTT 
CCTCCACTTCCTCCTCCTCCTCTTCCTCAAGTCAACGAAGATAATCTCCAGCAACGTCTC 
CAAGCTTTAATCGAAGGAGCAAACGAGAACTGGACTTACGCCGTGTTCTGGCAATCATCT 
CACGGTTTCGCCGGAGAAGACAACAACAACAACAACACAGTGTTGTTAGGTTGGGGAGAT 
GGTTATTACAAAGGAGAAGAAGAGAAGTCTAGAAAGAAGAAATCAAATCCAGCTAGTGCA 
GCTGAACAAGAGCATCGTAAGAGAGTGATTAGAGAGCTCAACTCTTTAATCTCCGGTGGT 
GTAGGAGGAGGAGATGAAGCTGGAGATGAAGAAGTTACAGATACTGAATGGTTCTTCTTA 
GTTTC^TGACA^GAGCTTTGTCAAGGGTACT^ 

TCAGACACGATTTGGTTATCTGGTTCTAATGCTTTAGCTGGATCAAGTTGTGAGAGAGCT 

CGTCAAGGTCAGATTTATGGGTTACAAACAATGGTGTGTGTAGCGACAGAGAATGGTGTC 

GTTGAGCTTGGTTCGTCGGAGATTATTCATCAAAGTTCAGATCTTGTTGATAAAGTTGAC 

ACCTTTTTCAATTTTAACAATGGTGGTGGTGAATTTGGTTCTTGGGCG 

CCAGATCAAGGAGAGAATGATCGAGGTTTGTGGATTAGTGAACCTAATGGTGTTGACTCT 

GGTCTTGTAGCTGCTCCGGTGATGAATAATGGTGGAAATGACTCAACTTCTAATTCTGAT 

TCTCAACCAATTTCTAAGCTTTGTAATGGAAGCTCTGTTGAAAACCCTAACCCTAAAGTT 

CTGAAATCTTGTGAAATGGTGAATTTCAAGAATGGGATTGAGAATGGTCAAGAAGAAGAT 

AGTAGTAATAAGAAGAGATCACCGGTTTCGAATAATGAAGAAGGGATGCTTTCTTTTACC 

TCTGTTCTTCCATGTGACTCGAATCACTCTGATCTTGAAGCTTCAGTGGCTAAAGAAGCT 

GAGAGTAACAGAGTTGTGGTTGAACCGGAGAAGAAACCGAGGAAACGAGGGAGAAAACCG 

GCGAATGGAAGAGAAGAGCCTTTGAATCATGTAGAGGCAGAGAGACAGAGAAGAGAGAAG 

TTGAATCAGAGATTCTATTCTTTAAGAGCTGTGGTTCCTAATGTGTCTAAGATGGATAAA 

GCn^CTCTATTAGGAGATGCTATTTCGTATATC^GTGAGCTTAAGTCTAAGTTGCAAAAG 

GCTGAATCTGATAAAGAAGAGTTGCAGAAGCAGATTGATGTGATGAATAAAGAAGCGGGA 

AATGCGAAAAGTTCGGTAAAAGATCGAAAATGTTTGAATCAAGAATCGAGTGTGTTGATA 

GAGATGGAGGTTGATGTGAAGATTATTGGTTGGGATGCAATGATAAGGATTCAATGTAGT 

AAGAGGAATCATCCTGGTGCTAAGTTCATGGAAGGACTTAAGGAGTTGGATTTGGAAGTG 

AATCATGCGAGTTTATCGGTAGTGAATGATCTTATGATCCAACAAGCGACTGTGAAAATG 

GGGAATCAGTTTTTCACGCAAGATCAACTCAAGGTTGCTCTAACGGAGAAAGTTGGAGAA 

TGTCCATGAATTGAAGTCAGCATCTTTAGGGCTAATACACCGGAGAATACTGCGAAAAGT 

CGAAAACAACGATCATAGTATAAGCCGCGGTAAAAAGTGTTAAACCTTTCACACAAGTTT 

CTCTAGTGAATGTAGTTGTAAACTCTATTGTGTAAGGGTAATTTTGTAGTACCCACTTGT 

TGCTATTGAATGCTTGTTAGAGAGGATTCTTAGTGTAGTATATGATTAGGTTGGGGTTTG 

TTGTTTCATGAGATAAATAAATGTGTTTGATCAATGGTTAAGTCTTTGGTTTGTTGGTGT 

ATGTATGTAAATAAGGCTTTTGTTAGAAATAAGACAAATGGGACTGAAGTTGGAGTTTAA 

AA 

>G584 Amino Acid Sequence (domain in AA coordinates: 401-494) 
MSPTNVQVTDYHLNQSKTDTTNLWSTDDDASVMEAFIGGGSDHSSLFPPLPPPPLPQVNE 
DNLQQRLQALIEGANENWTYAVFWQSSHGFAGEDNNNNNTVLLGWGDGYYKGEEEKSRKK 

KSNPASAAEQEHRKRVIRELNSLISGGVGGGDEAGDEEVTDTEWFFLVSMTQSFVKGTGL 
PGQAFSNSDTIWLSGSNALAGSSCERARQGQIYGLQTMVCVATENGVVELGSSEIIHQSS 
DLVDKVDTFFNFNNGGGEFGSWAFNLNPDQGENDRGLWISEPNGVDSGLVAAPVMNNGGN 

DSTSNSDSQPISKLCNGSSVENPNPKVLKSCEMVNFKNGIENGQEEDSSNKKRSPVSNNE 

EGMLSFTSVLPCDSNHSDLEASVAKEAESNRVVVEPEKKPRKKGRKPANGREEPLNHVEA 

ERQRREKLNQRFYSLRAVVPNVSKMDKASLLGDAISYISELKSKLQKAESDKEELQKQID 

VMNKEAGNAKSSVKDRKCLNQESSVLIEMEVDVKIIGWDAMIRIQCSKRNHPGAKFMEGL 

KELDLEVNHASLSVVNDLMIQQATVKMGNQFFTQDQLKVALTEKVGECP* 

>G668 (1..1056) 

ATGGGAAGACCACCTTGCTGTGAAAAGATTGGAGTGAAGAAAGGGCCATGGACACCAGAG 
GAAGACATCATCTTGGTTTCTTACATCCAAGAACATGGTCCTGGAAACTGGAGATCTGTC 
CCAACACACACAGGTTTAAGATGTAGCAAGAGCTGCAGATTGAGATGGACTAATTATCTT 
CGACCCGGTATTAAGCGTGGAAATTTTACTGAGCATGAAGAGAAGACAATTGTTCATCTT 
CAAGCCCTTTTAGGCAAC^GATGGGCAGCCATAGCATC^TACCTTCCAGAAAGGACAGAC 
AATGATATAAAGAACTATTGGAACACTCACTTGAAGAAGAAGCTCAAAAAGATTAATGAA 
TCTGGTGAAGAAGAfAATGATGGTGTCTCTTCATCAAACACTAGTTCACAAAAGAACCAT 
CAAAGCACTAACAAAGGTCAATGGGAAAGAAGACTTGAGACAGACATTAACATGGCAAAA 
CAAGCTCTTTGTGAGGCCTTGTCTTTAGACAAACCATCATCCACTCTTTCATCATCTTCA 
TCATTACCGACACCAGTAATCACACAACAAAACATCCGTAACTTCTCATCAGCTTTGCTT 
GACCGTTGTTATGATCCATCCTCTTCTTCTTCATCTACGACAACCACCACTACAAGCAAC 
ACTACTAATCCATACCCATCAGGGGTATATGCGTCAAGTGCTGAGAACATCGCCCGGTTG 
CTTCAAGATTTCATGAAAGACACACCCAAGGCTTTAACTTTATCATCTTCATCTCCGGTT 
TCAGAGACTGGACCACTCACTGCTGCAGTCTCGGAAGAAGGTGGAGAAGGGTTTGAACAA 
TCTTTCTTCAGCTTCAATTCAATGGACGAAACTCAAAACTTGACTCAGGAGACAAGCTTC 
TTCCATGATCAAGTGATCAAACCGGAAATAACAATGGACCAAGATCATGGTCTAATATCA 
CAAGGGTCTCTGTCTTTGTTTGAGAAATGGTTATTTGATGAGCAT^AGCCACGAGATGGTT 
GGTATGGCACTAGCAGGACAAGAAGGGATGTTCTAG 

>G668 Amino Acid Sequence (domain in AA coordinates: 13-113) 

MGRPPCCEKIGVKKGPWTPEEDIILVSYIQEHGPGNWRSVPTHTGLRCSKSCRLRWTNYL 

RPGIKRGNFTEHEEKTIVHLQALLGNRWAAIASYLPERTDNDIKNYWNTHLKKKLKKINE 

SGEEDNDGVSSSNTSSQKNHQSTNKGQWERRLQTDINMAKQALCEALSLDKPSSTLSSSS 

SLPTPVITQQNIRNFSSALLDRCYDPSSSSSSTTTTTTSNTTNPYPSGVYASSAENIARL 

LQDFMKDTPKALTLSSSSPVSETGPLTAAVSEEGGEGFEQSFFSFNSMDETQNLTQETSF 
FHDQVIKPEITMDQDHGLISQGSLSLFEKWLFDEQSHEMVGMALAGQEGMF* 

>G1050 (23..1582) 

TTCCCCATTTCAGAAAATCAAAATGGGTGGTGGTGGTGATACAACAGATACCAATATGAT 
GCAGAGAGTTAATTCTTCTTCTGGTACATCGTCTTC^ 

CTTGAATCCTGCTCTTATCCGCTCTCACCATCACTTCCGTCACCCTTTCACCGGAGCTCC 

TCCACCGCCGATTCCACCCATTTCTCCTTACTCTCAGATCCCGGCGACTTTACAACCTAG 

ACATTCTCGCTCTATGTCGCAACCGTCTTCTTTCTTC 

AAATCCTTCTtX^CCG^ 

TCCTTCGTTGCCTCCGTCACCGTTTACGATGTGTCATTCTTCTAGCTCTAGGAACGCCGG 
AGATGGAGAGAATCTACCTCCGAGAAAGTCGCATAGGCGTTCGAATAGTGATGTTACTTT 

TQQGTTTAGTTCAATQATGTCTCAQAATCAA 

ATCGATCTCTGGTGAAGATACATCAGATTGGTCTAATTTGGTGAAGAAAGAACCGAGAGA 

AGGCTTCTACAAGGGAAGAAAACCAGAGGTTGAAGCAGCTATGGACGATGTTTTCACGGC 

TTATATGAATCTTGATAACATTGATGTCTTGAATTCTTTTGGAGGTGAAGATGGCAAGAA 

TGGGAATGAGAATGTGGAGGAGATGGAGAGTAGTAGAGGTAGTGGTACAAAGAAGACGAA 

TGGTGGAAGTAGTAGTGATTCTGAAGGAGATAGCAGTGCGAGTGGGAATGTGAAGGTTGC 

GTTGAGTTCTTCTTCTTCAGGCGTGAAGAGAAGAGCAGGTGGAGATATTGCTCCTACTGG 

TAGACATTACAGGAGTGTTTCTATGGACAGTTGTTTCATGGGGAAGTTGAATTTCGGCGA 

CGAATCATCGCTAAAGCTTCCGCCTTCTTCATCAGCrrAAAGTTTCCCCAACCAATTCAGG 

TGAAGGGAATTCAAGTGCTTATAGTGTTGAATTTGGAAACAGTGAGTTTACTGCAGCTGA 

AATGAAGAAGATTGCAGCTGATGAGAAACTCGCTGAGATTGTAATGGCTGACCCTAAGCG 

TGTTAAAAGAATCTTGGCGAACCGCGTATCTGCTGCACGTTCAAAGGAGCGGAAGACGCG 

ATACATGGCAGAGTTGGAACACAAGGTGCAGACACTTCAGACTGAAGCTACTACATTATC 

GGCTCAGCTCACAGATTTGCAGAGAGATTCTATGGGGTTGACAAACCAGAACAGTGAGCT 

GAAGTTTCGTCTTCAAGCTATGGAGCAGCAAGCACAACTCCGCGATGCTCTGTCAGAGAA 

ACTGAATGAAGAAGTCCAGCGGTTGAAACTGGTGATAGGGGAGCCGAACCGCAGGCAAAG 

TGGGAGCAGCAGCAGCGAATCAAAGATGTCACTAAACCCGGAGATGTTTCAGCAGCTTAG 

CATAAGTCAGTTACAACACCAACAGATGCAGCATTCCAATCAGTGTAGCACAATGAAAGC 

AAAGCACACTTCAAACGACTAGGGTAAGTAAAACTGCGATCCGCAGTTGTCTAGTTACAT 

ATATGATAAGAATCTTTTGTGCAGAGTTCTGTTTTTGGAAGTTTTAAAGT^ACATATATA 

AAGATTATGTCCGGGAAATTTGATCATATTTCCTGAAACATACAC^ 

TAATGGAGGACTTTCTTTCTGGACCA 

>G1050 Amino Acid Sequence (domain in AA coordinates: 372-425) 

MGGGGDTTDTNMMQRVNSSSGTSSSSIPKHNLHLNPALIRSHHHFRHPFTGAPPPPIPPI 

SPYSQIPATLQPRHSRSMSQPSSFFSFDSLPPLNPSAPSVSVSVEEKTGAGFSPSLPPSP 

FTMCHSSSSRNAGDGENLPPRKSHRRSNSDVTFGFSSMMSQNQKSPPLSSLERSISGEDT 

SDWSNLVKKEPREGFYKGRKPEVEAAMDDVFTAYMNLDNIDVLNSFGGEDGKNGNENVEE 

MESSRGSGTKKTNGGSSSDSEGDSSASGNVKVALSSSSSGVKRRAGGDIAPTGRHYRSVS 

MDSCFMGKLNFGDESSLKLPPSSSAKVSPTNSGEGNSSAYSVEFGNSEFTAAEMKKIAAD 

EKLAEIVMADPKRVKRILANRVSAARSKERKTRYMAELEHKVQTLQTEATTLSAQLTDLQ 
RDSMGLTNQNSELKFRLQAMEQQAQLRDALSEKLNEEVQRLKLVIGEPNRRQSGSSSSES 
KMSLNPEMFQQLSISQLQHQQMQHSNQCSTMKAKHTSND* 
>G1463 (199..1209) 

TATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGC 
TGACAAGCTGACTCTAGCAGATCTGGTACCGTCGACAGTTTGAGATTTGCTTCATCCGGT 
TTTTTTATTTTCTGCAAAATATGTCACTCTCTCCCATTTTGTTCATATATAATATGTTTG 
AAGTTTGATCAACTTAGTATGCGTTTCTTTTTCTCTCTAGTTCCTCTGTTTCTTGGTCGA 
TTTAGTTTCGTTATGGCGGACACACTGCTCAACGCAGAAGACGAAGTAATAATCTCACGT 
TATCTGAAGCCTATGATCGTTAACAGAGTATCATGGCCTGATCTCTTCATCGAAGACGCA 
GACGTGTTCAACAAGGATCCATATGTGAAGTTCCATGCTGAGATCCCTAGCTTCGTGATC 

GTTAAACCACGAACAAAGGCTTGTGGTAAAACCGATGGATGTGATTCGGGTTGCTGGAGG 

ATCATTGGTCGTGATAAGCTGATAAAGTCGGAGGAGACTGGTAAGATTCTAGGGTTCAAG 

AAGATACTCAAGTTCTGCCTAAAGTGGAAACCTAGAGAATACAAGAGAAGTTTGGTAATG 

GAAGAGTATAGGCTTACCAATAACTTCAACTGGAAGCAAGATCATGTGATTTGCAAGA^ 

CGGCTTTTGTTTGAAGCAGAAATTAGTTTCTTGCTAGCCAAGCA 

GACTCACTTCCTCGAAATGTGCTGTTGCCAGC^ 

GAGGAGGACGAATTTTATCCGGTGACGATAATGATTTCAGAAGGAAAAGATTGGCCTAGC 

TACGTTACCAAC^CGTGTATTGTCTGC^TC<^TCGGAGCTTGTGAATGTTCACGATGGG 

AAGTTTCATGATAACGGAATCTGCATCTTCGCTAACAGGACTTGTGGTGTAACCGATAAA 

TGCAATGAAGGTTACTGGAAGATTAAGCACCGTGAGAAGCTGATCATGTCACGGTACGGG 

CAGACCATTGGTTGGAAGAAAGTTTTTCAGTTTTATGAAACGGAGAAAGAAAGACATTTT 

GGTAATGGAGAAGAAGTGAAGGTAACTTGGACTCTAAAAGAGTATAGGCTTACCAGAAAA 

ATGAACAAGAATAAAGTGGTGTGCGTTATCAAGTATAAGGTAAAGTGTTTACCGAGGATA 

ACTAGCTAGGGACTTCTACTCTTGGTTTCATGATCGATGCGACCGCTCTAGACAGGCCTC 

GTACCGGATCCTCTAGCTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCA 

>G1463 Amino Acid Sequence (conserved domain in AA coordinates: 9-156) 

MRFFFSLVPLFLGRFSFVMADTLLNAEDEVIISRYLKPMIVNRVSWPDLFIEDADVFNKD 

PYVKFHAEIPSFVIVKPRTKACGKTDGCDSGCWRIIGRDKLIKSEETGKILGFKKILKFC 

LKWKPREYKRSLVMEEYRLTNNFNWKQDHVICKIRL^ 

VXLPAYGFCSPDKQEEDEFYPWIMISEGKDWPSYVTNNWCI^^ 

ICIFANRTCGVTDKCNEGYWKIKHREKLIMSRYGQTIGWKKVFQFYETEKERHFGNGEEV 

KVTWTLKEYRLTRKMNKNKVVCVIKYKVKCLPRITS* 

>G1944 (236..1306) 

TCGACCTTCCTAATTTCCAACCTCTGTTCT^ 

CTCAGTTTGATTTTCTTCTTCTAGCTCTTAAGTATATTTCTTTGTTGTTATTTATCTTTT 
AATCCTTTAATCTCATCTTTGTTTATCTTTAATCAAAACCCAAAATTTACATGGGTTCTT 
GAAAATCTAGAAGAAATAAAGGAAACATAACAAAAATAGAAAGAAAAAGAAGCTAATGGT 
CTTAAATATGGAGTCTACCGGAGAAGCTGTTAGATCAACCACCGGTAACGACGGTGGTAT 
TACGGTGGTTAGATCCGACGCGCCGTCAGATTTCCACGTAGCTCAAAGATCAGAAAGCTC 
AAACCAATCTCCCACCTCTGTCACTCCTCCTCCACCACAGCCATCGTCTCATCACACAGC 
TCCTCCGCCGCTGCAAATTTCGACGGTGACGACTACGACTACGACGGCCGCGATGGAAGG 
TATCTCCGGTGGACTGATGAAGAAGAAGCGTGGACGGCCAAGGAAGTATGGACCGGACGG 
GACTGTTGTAGCGTTATCTCCTAAACCGATTTCATCAGCGCCGGCGCCGTCGCATCTTCC 
GCCGCCGAGTTCACACGTCATCGATTTCTCCGCTTCTGAGAAACGTAGCAAAGTGAAACC 
AACGAACTCGTTTAACAGAACAAAGTATCATCACCAAGTTGAGAATTTGGGTGAATGGGC 
TCCTTGCTCCGTCGGTGGTAATTTCACACCTCATATAATCACAGTCAACACCGGCGAGGA 
TGTAACAATGAAGATAATCTCGTTTTCGCAACAAGGACCTCGCTCTATTTGTGTTCTGTC 
AGCAAACGGTGTTATTTCAAGCGTTACACTTCGTCAGCCAGATTCCTCTGGCGGCACATT 
GACATACGAAGGTCGGTTTGAGATATTATCATTATCCGGGTCATTCATGCCTAATGATTC 
AGGCGGAACACGAAGTAGAACGGGAGGAATGAGTGTATCGTTAGCAAGTCCCGATGGACG 
TGTAGTAGGCGGTGGCCTCGCCGGTTTACTAGTAGCCGCGAGTCCGGTTCAGGTGGTTGT 
AGGAAGTTTTTTAGCGGGCACTGACCATCAAGATCAGAAACCGAAAAAGAACAAACATGA 
TTTCATGTTGTCGAGTCCTACCGCTGCAATTCCTATCTCTAGTGCAGCTGATCACCGGAC 
AATCCATTCGGTCTCGTCTCTTCCGGTCAATAATAATACATGGCAGACTTCTTTAGCTTC 
CGATCCAAGAAACAAGCATACCGATATTAATGTCAATGTAACTTGAAATCCAATCTTTCT 
CTGTATTTTCTGTTAAC^GTTTGATTTGGTTGTTTATCTACATTAGGATTTTACTAAAA 
TGGTAGTATTATTTATAGGGTTTTAGGGTCTTTATTTTGGTTCCACTGTTGTCACTTGTA 
GGATA 

>G1944 Amino Acid Sequence (domain in AA coordinates: 87-100) 
MVLNMESTGEAVRSTTGNDGGITVVRSDAPSDFHVAQRSESSNQSPTSVTPPPPQPSSHH 
TAPPPLQISTVTTTTTTAAMEGISGGLMKKKRGRPRKYGPDGTVVALSPKPISSAPAPSH 
LPPPSSHVIDFSASEKRSKVKPTNSFNRTKYHHQVENLGEWAPCSVGGNFTPHIITVNTG 
EDVTMKIISFSQQGPRSICVLSANGVISSVTLRQPDSSGGTLTYEGRFEILSLSGSFMPN 
DSGGTRSRTGGMSVSLASPDGRVVGGGIAGLLVAASPVQVVVGSFLAGTDHQDQKPKKNK 
HDFMLSSPTAAIPISSAADHRTIHSVSSLPVNNNTWQTSLASDPRNKHTDINVNVT* 
>G2383 (37..990) 
GACCTCTTTGATCCCTTCATTCCCCATCAM 

ATTCAAAGCCCTAATTCTCACCATCACTACTCTTCGCCTTCTTTTCCTTTCTCTT 

TTTCTTGAGAGTTTTGATGAATCCTTCTTGATAAACCAATTCTTGTTACAGCAGCAAGAT 

GTAGCAGCAAATGTTGTTGAATCTCCTTGGAAATTTTGCAAGAAGCTTGAGCTTAAGAAG 

AAGAATGAGAAGTGTGTTGATGGAAGCACCTCACAAGAGGTTCAATGGAGAAGGACGGTC 

AAAAAAAGGGACAGGCATAGTAAGATCTGCACGGCTCAAGGTCCTAGAGACCGGAGGATG 

AGGCTGTCTCTTCAGATTGCTCGCAAGTTTTTCGATC 

AAGGCGAGC^GACGATTGAATGGCTTTTCTCCAAATCAAAGACTTCCATCAA^ 

AAAGAAAGAGTGGCTGCATCGGAAGGAGGAGGAAAGGATGAACATCTCCAGGTTGATGAA 

AAGGAAAAGGATGAGACACTGAAGTTGAGAGTCTCAAAGAGAAGAACAAAGACTATGGAG 

AGCTCTTTTAAGACTAAAGAGTCGAGAGAGAGAGCTAGAAAGCGAGCAAGAGAGAGAACA 

ATGGCAAAGATGAAGATGAGATTATTTGAGACCTCGGAAACAATTTCAGATCCTCATCAA 

GAAACTAGAGAGATCAAGATAACCAATGGTGTACAATTACTAGAAAAGGAAAATAAAGAA 

C^GAATGGAGTAATACTAATGATGTTCACATGGTAGAGTATCAAATGGATTCTGTGAGC 

ATCATAGAGAAGTTTCTTGGACTT^CCAGTGACTCTAGCTCCTCTTCCATTTTTGGTGAC 

TCCGAGGAATGTTACACAAGTCTTAGTTCAGTAAGAGGTACAATTTCAGCAGCAGGTAAC 

AGCAATGTGTTAACTAAAAACCCTAATTGAGT^ 

TGGTAATTCCAGGAATGTCGACACCAAGGG 

>G2383 Amino Acid Sequence (conserved domain in AA coordinates: 89-149) 

MFPSFITHIQSPNSHHHYSSPSFPFSSDFLESFDESFLINQFLLQQQDVAANVVESPWKF 

CKKLELKKKNEKCVDGSTSQEVQWRRTVKKRDRHSKICTAQGPRDRRMRLSLQIARKFFD 

LQDMLGFDKASKTIEWLFSKSKTSIKQLKERVAASEGGGKDEHLQVDEKEKDETLKLRVS 
KRRTKTMESSFKTKESRERARKRARERTMAKMKMRLFETSETISDPHQETREIKITNGVQ 

LLEKENKEQEWSNTNDVHMVEYQMDSVSIIEKFLGLTSDSSSSSIFGDSEECYTSLSSVR 
GTISAAGNSNVLTKNPN* 
>G571 (326..1708) 

TAGCCGACCTCTCTTCTCTCTTCTGAAAAAAACACCAAAGGAGCTTTAAATGCTCCGTTA 
CATAATCTCTATCTCTTTCCAAGAATATAGAGAAAGGAAAATAATATACAAGAATTAAAA 
GAAGGTATATCATCATCTCTCTAGCTAGTGATC^ 

ATCAGCTTGCCTCAGAGGAGAAGACCAACATAAGAGAGATCGAAGATCAAAATCTATCTC 
TCTTCATCATCTTCTGCTGTTACTATCATATCACACGCTCTCTCAAACATCATCCTATAT 
ATAGACTTCTCTTCATCATCATCAAATGC^ 

ATCATCATCCTCCGCCACGTCTTCCCATGGAAACTTCATGAACAAAGATGGGTATGATAT 
TGGAGAGATAGACCCATCACTCTTCCTCTATCTTGATGGACAAGGACATCATGATCCTCC 
ATCAACTGCTCCTTCTCCTTTACATCATCATCACACAACTCAGAATTTGGCGATGAGACC 
TCCAACATCGACGCTCAACATCTTTCCATCTCAGCCTATGCACATAGAGCCACCTCCTTC 
TTCTACACACAATACCGATAATACAAGATTAGTTCCGGCTGCTCAACCTAGTGGTTCCAC 
TCGACCAGCTTCTGACCCGTCCATGGACTTGACCAAT(^TTCTCAGTTTCATC^\ACCTCC 
TCAAGGTTCTAAATCCATCAAGAAGGAAGGGAACCGCAAGGGTCTTGCCTCATCGGACCA 
TGACATACCTAAATCGTCAGACCCTAAAACATTGAGAAGACTAGCACAAAACAGAGAAGC 
AGCAAGAAAAAGCAGATTACGTAAAAAGGCTTATGTTCAGCAACTCGAGTCATGTAGGAT 
CAAACTGACCCAACTAGAACAAGAGATTCAACGGGCCAGATCCCAAGGCGTATTCTTTGG 
AGGGTCTCTTATAGGAGGAGATCAACAGCAAGGTGGACTACCCATTGGCCCTGGCAACAT 
CAGCTCTGAAGCAGCGGTGTTCGATATGGAATATGCGAGGTGGCTGGAGGAGCAGCAGAG 
GCTATTAAACGAACTAAGGGTGGCAACACAAGAACACTTGTCCGAGAACGAGCTTAGGAT 
GTTTGTGGACACATGTTTAGCTCATTATGACCATTTGATTAACCTCAAGGCTATGGTCGC 
TAAGACCGATGTCTTCCACCTCATTTCTGGAGCATGGAAAACTCCAGCTGAACGTTGCTT 
CTTGTGGATGGGTGGTTTCCGTCCATCGGAGATCATTAAGGTGATTGTGAACCAGATAGA 
ACC^TTGACGGAGCAACAGATAGTTGGGATATGTGGGCTGCAACAGTCCACACAAGAGGC 
CGAGGAGGCTCTCTCGCAAGGCCTCGAGGCGTTGAATCAATCACTTTCCGATAGCATTGT 
CTCTGACTCCCTCCCGCCTGCCTCCGCACCACTTCCTCCTCATCTATCCAATTTCATGTC 
ACACATGTCCTTAGCTCTCAACAAGCTCTCTGCTCTCGAGGGCTTCGTTCTCCAGGCGGA 
TAATTTGAGGCACCAAACGATCCATAGGCTGAACCAATTGTTGACGACCCGTCAAGAAGC 
ACGGTGTCTTCTAGCCGTTGCGGAGTACTTCCACCGTCTTCAAGCTCTAAGTTCTCTCTG 
GCTAGCCCGTCCTCGGCAAGATGGATAATACTAAAACAACTGATGAAGGAAACCAAAAAC 
AAAAACAAGAGAATAGGTTGATTAGTTAGCCGCCAGCTTGACCTCTTTATCATATATATC 
GTCTCTCTACTCAAATACAGTGCAATTAGGGAAAATTGTTTGGCTTCTTTTTGGTATATG 
ATTCTTACTATTATGTTTTTAATCAAGA 

>G571 Amino Acid Sequence (domain in AA coordinates: 160-220) 
MQGHHQNHHQHLSSSSATSSHGNFMNKDGYDIGEIDPSLFLYLDGQGHHDPPSTAPSPLH 
HHHTTQNLAMRPPTSTLNIFPSQPMHIEPPPSSTHNTDNTRLVPAAQPSGSTRPASDPSM 
DLTNDSQFHQPPQGSKSIKKEGNRKGLASSDHDIPKSSDPKTLRRLAQNREAARKSRLRK 

KAYVQQLESCRIKLTQLEQEIQRARSQGVFFGGSLIGGDQQQGGLPIGPGNISSEAAVFD 
MEYARWLEEQQRLLNELRVATQEHLSENELRMFVDTCLAHYDHLINLKAMVAKTDVFHLI 
SGAWKTPAERCFLWMGGFRPSEIIKVIVNQIEPLTEQQIVGICGLQQSTQEAEEALSQGL 
EALNQSLSDSIVSDSLPPASAPLPPHLSNFMSHMSLALNKLSALEGFVLQADNLRHQTIH 
RLNQLLTTRQEARCLLAVAEYFHRLQALSSLWLARPRQDG* 
>G636 (6..1814) 

CGATGATGCAACTGGGTGGTGGTACTCCGACCACTACAGCGGCGGCTACAACCGTCACAA 

CTGCTACAGCACCACCGCCACAATCAAACAACAACGATTCAGCGGCAACAGAAGCAGCGG 

CAGCAGCGGTTGGGGCGTTTGAGGTGTCGGAAGAGATGCACGACCGTGGGTTTGGAGGAA 

ATCGTTGGCCGCGGCAGGAAACGCTAGCGTTGTTGAAAATACGATCTGACATGGGAATAG 

CGTTTCGAGACGCTAGCGTTAAAGGTCCCTTATGGGAAGAGGTTTCTAGGAAAATGGCGG 

AGCATGGTTACATAAGAAACGCAAAGAAATGCAAAGAGAAATTCGAGAACGTTTACAAAT 

ACCACAAACGAACCAAAGAAGGTCGTACCGGAAAATCCGAAGGCAAAACTTATCGCTTCT 

TTGATCAATTAGAAGCTCTCGAGTCTCAATCTACAACCTCACTCCACCATCATCAACAAC 

AAACGCCTCTTCGACCACAGCAAAACAACAACAACAACAACAACAACAACAACAACAG 

CCATATTTTCAACTCCTCCTCCGGTAACGACAGTTATGCCGACGCTTCCTTCTTCATCAA 

TTCCTCCGTATACTCAGCAGATTAATGTACCTTCGTTTCCAAACATCTCCGGTGATTTTC 

TATCGGATAATTCTACATCGTCTTCGTCTTCTTATTCGACTTCTTCTGACATGGAGATGG 

GTGGTGGAACTGCGACTACAAGGAAGAAAAGGAAGAGGAAATGGAAGGTGTTTTTCGAGC 

GGTTGATGAAACAAGTAGTTGATAAACAGGAAGAGCTTCAACGCACATTCTTGGAAGCTG 

TTGAAAAGCGAGAACACAAGAGATTGGTTAGAGAAGAGTCTTGGAGAGTTCAAGAGATTG 

CCAGAATCAACCGCGAGCACGAGATCTTAGCTCAAGAACGCTCTATGTCCGCTGCAAAAG 

ACGCTGCTGTTATGGCCTTTCTTCAAAAACTGTCAGAGAAACAACCGAATCAGCCACAAC 

CGCAGCCTCAGCCGCAACAAGTTCGACCATCAATGCAGCTTJ^ 

AACCGCCTCAACGGTCTCCTCCACCGCAACCTCCTGCTCCGCTTCCGCAGCCAATTCAAG 

CGGTTGTGTCGACGTTAGACACAACGAAAACGCACAATCGTGGTGATCAGAATATGACTC 

CTGCAGCTTCAGCGAGCTCGTCGCGGTGGCCGAAAGTGGAGATAGAAGCATTGATAAAGC 

TGAGGACGAATCTTGATTCGAAATATCAAGAAAACGGACCAAAAGGACCATTGTGGGAAG 

AGATATCAGCGGGAATGAGAAGGTTAGGATTCAACAGGAACTCAAAGAGATGCAAAGAGA 

AATGGGAAAACATAAACAAATACTTCAAGAAAGTCAAAGAGAGCAACAAGAAACGTCCCG 

AAGATTCCAAGACTTGCCCTTACTTTCACCAGCTTGATGCTTTATATAGAGAGAGGAACA 

AATTCCACAGCAACAACAACATTGCAGCTTCTTCTTCATCTTCCGGTCTTGTTAAACCGG 

ATAATTCTGTTCCCTTGATGGTCCAACCAGAGCAGCAATGGCCTCCGGCTGTAACGACTG 

CGACAACTACTCCCGCAGCGGCTCAGCCTGATCAGCAATCTCAGCCGTCGGAGCAGAACT 

TTGATGATGAAGAAGGTACAGATGAAGAGTACGACGATGAAGATGAGGAAGAGGAGAATG 

AAGAAGAGGAAGGAGGTGAGTTCGAGCTTGTGCCTAGCAATAACAACAACAACAAGACGA 

CGAATAATCTGTAATGATGATGATTCGAGTTCGAACCGGTTTGGTGGTGAAAGATTAGTA 

ATCTTTTTTTAAGTTTTGATACAGAACATGAGAATTTAAATATTGGAGGGTTT 

>G636 Amino Acid Sequence (domain in AA coordinates: 55-145, 405-498) 

MQLGGGTPTTTAAATTVTTATAPPPQSNNNDSAATEAAAAAVGAFEVSEEMHDRGFGGNR 

WPRQETLALLKIRSDMGIAFRDASVKGPLWEEVSRKMAEHGYIRNAKKCKEKFENVYKYH 

KRTKEGRTGKSEGKTYTCFFDQLEALESQSTTSLHHHQQQ 

FSTPPPVTTVMPTLPSSSIPPYTQQINVPSFPNISGDFLSDNSTSSSSSYSTSSDMEMGG 
GTATTRKKRKRKWKVFFERLMKQVVDKQEELQRTFLEAVEKREHKRLVREESWRVQEIAR 
IlTOEHEILAQERSMSAAKDAAVl^FLQK^ 

PQRSPPPQPPAPLPQPIQAVVSTLDTTKTHNRGDQNMTPAASASSSRWPKVEIEALIKLR 

TNLDSKYQENGPKGPLWEEISAGMRRLGFNRNSKRCKEKWENINKYFKKVKESNKKRPED 

SKTCPYFHQLDALYRERNKFHSNNNIAASSSSSGLVKPDNSVPLMVQPEQQWPPAVTTAT 

TTPAAAQPDQQSQPSEQNFDDEEGTDEEYDDEDEEEENEEEEGGEFELVPSNNNNNKTTN 

NL* 

>G878 (197..1738) 

CAAAAAAAATCTCTCCCATTAAAAGACTGCCCAAAGAAATATTTTATACAAAATGAAAGA 
GAGAAAC^CGACACGAATTTTGTATAATTAAGATTACACAAAAAAAAGTGTTAGAAAGAG 

AAATATCTTCTTCTTTTTTCTGTGTGAGTTGGGTTTGTTAAAGTTTTATCCTTTTTG 

TCAAAATCAAGAATCGATGGCGGAGAAGGAAGAAAAAGAACCATCGAAGTTAAAATCATC 

CACCGGAGTTTCACGGCCAACGATTTCACTACCTCCTCGACCGTTTGGTGAAATGTTTTT 

TAGCGGTGGCGTTGGATTTAGTCCTGGACCAATGACTCTCGTCTCAAATTTATTCTCTGA 

TCCTGATGAGTTCAAGTCTTTCTCTCAGCTTTTAGCTGGAGCTATGGCTTCTCCGGCGGC 

AGCTGCTGTTGCCGCCGCTGCTGTGGTTGCTACTGCTCATCATCAGACACCTGTGAGCTC 

TGTCGGTGATGGCGGTGGAAGCGGTGGTGATGTTGACCCGAGGTTTAAGCAGAGTAGACC 

AACGGGATTGATGATAACTCAACCACCGGGGATGTTTACTGTACCGCCGGGGTTAAGTCC 

GGCTACTCTTrTGGATTCTCCGAGCTTC 

TGGTATGACACATCAACAAGCTTTAGCACAAGTCACTGCACAAGC^ 
TGTTCATATGCAGCAATCACAACAATCTG 

ACAACAACAACAAGOTTCATTGACTGAGATTCCATCATTTTCTTCTGCACCTAGGTCT 

GATTCGAGCCTCGGTTCAAGAAACATCGCAGGGTCAGAGAGAGACTTCGGAAATATCTGT 

CITTGAGCATCGGTCACAGCCTCAAAATGCTGACAAACCAGCTGATGATGG 

GCGGAAATATGGGCAGAAGCT^GTGAAGGGGAGCGATTTTCCTCGGAGTTATTACAAATG 

TACGCATCCAGCTTGTCCTGTCAAGAAGAAAGTGGAGAGGTCACTCX3ATGGACAAGTAAC 

GG AAATCATCTACAAGGGTCAACACAATCATGAGCTTCCTCAAAAGCGCGGTAACAATAA 

CGGGAGTTGTAAAAGTTCTGATATTGCAAATCAGTTTCAAACAAGTAATAGCAGTCTCAA 

CAAGAGTAAGAGGGACCAGGAMCAAGCCAAGTTAGAACAACAGAGCAGATGTCTGAAGC 

AAGTGATAGCGAGGAGGTTGGGAATGCAGAGACTAGTGTGGGAGAAAGACATGAGGATGA 

GCCTGATCCCAAGCGAAGAAATACAGAAGTTCGGGTTTCAGAACCAGTTGCTTCATCGCA 

TAGAACTGTGACAGAGCCTAGGATTATTGTCCAAACGACGAGTGAAGTTGACCTCTTAGA 

TGATGGATATAGGTGGCGCAAGTATGGTCAGAAAGTAGTCAAAGGAAATCCTTATCCGAG 

GAGCTACTATAAGTGTACAACACCAGATTGCGGAGTAAGGAAACATGTAGAGAGAGCAGC 

AACTGACCCAAAAGCTGTTGTAACAACATATGAAGGTAAACATAACCATGATGTTCCAGC 

TGCTAGAACCAGCAGCCATCAGTTAAGACCAAACAAT 

CTTCAATCATCAACAGCCTGTTGCACGTTTAAGGCTTAAAGAAGAGCAAATCACTTGACA 
GAGAAGAAGAATACGACGGCGCTTGAGCTTTTGTGAGTTTAATGAATCTTCTTTTTGGTT 
AATGAACCTGTTTTTGTTGCCTCAAAACACCACAGGTTTCTCTGGACAGAATCTCTGATA 
TTACAGTTTCAAAAGGTATGTTCTTTTATTTCATGTTGGAATCTTCTGTGTAATCTTAAG 

AAGCTTTAGGAGGTAATGTAAAAAACCAGATTCAAAGTTATGCCCTTATGTGAATTCTTT 
TGTACATGGGATAAACAAAATTTACAGGTATCCTTTTTG 

AAAA 

>G878 Amino Acid Sequence (domain in AA coordinates: 250-305, 415-475) 

MAEKEEKEPSKLKSSTGVSRPTISLPPRPFGEMFFSGGVGFSPGPMTLVSNLFSDPDEFK 

SFSQLLAGAMASPAAAAVAAAAVVATAHHQTPVSSVGDGGGSGGDVDPRFKQSRPTGLMI 

TQPPGMFTVPPGLSPATLLDSPSFFGLFSPLQGTFGMTHQQALAQVTAQAVQGNNVHMQQ 

SQQSEYPSSTQQQQQQQQQASLTEIPSFSSAPRSQIRASVQETSQGQRETSEISVFEHRS 

QPQNADKPADDGYNWRKYGQKQVKGSDFPRSYYKCTHPACPVKKKVERSLDGQVTEIIYK

GQHNHELPQKRGl^GSCKSSDIANQFQTSNSSLNK^ 

VGNAETSVGERHEDEPDPKRRNTEVRVSEPVASSHRTVTEPRIIVQTTSEVDLLDDGYRW 
RKYGQKWKGNPYPRSYYKCTTPDCGTOKHVERAATD^ 
HQLRPNNQHNTSTVNFNHQQPVARLRLKEEQIT* 
>G1134 (61..849)

TAAAGAAAGAGAAAAAAAGCTTTCGTAGTGTCTATTGAAACCAGAGAAAAGCCAAAGGGG 
ATGCAACCAACATCCGTCGGTAGTAGCGGCGGTGGTGACGACGGAGGAGGCAGAGGAGGA 
GGAGGAGGGCTAAGTAGAAGTGGACTATCTCGGATCCGTTCAGCTCCAGCGACTTGGCTT 
GAAGCTTTACTTGAGGAAGATGAAGAAGAGTCTTTGAAACCTAATCTTGGTCTCACCGAT 
TTGCTTACCGGGAACTCGAACGATTTACCGACAAGTCGCGGCTCGTTCGAGTTCCCGATT 
CCTGTTGAGCAAGGGTTGTATCAACAAGGTGGGTTTCACCGACAGAATAGTACTCCGGCG 
GATTTTCTTAGTGGTTCTGATGGATTTATCCAAAGCTTTGGGATTCAGGCGAATTACGAT 
TACTTATCGGGGAATATCGATGTTTCTCCGGGAAGTAAGCGGTCTAGAGAAATGGAAGCA 
CTCTTCTCTTCn'CCTGAGTTTACTTCTCAAATGAAAGGAGAGCAAAGCAGCGGTCAAGTT 
CCTACCGGAGTATCAAGCATGTCGGATATGAACATGGAGAACCTTATGGAGGACTCTGTT 
GCTTTTAGGGTTCGGGCTAAACGTGGTTGCGCAACTCATCCCCGCAGCATTGCCGAGAGG 
GTACGAAGGACGCGGATTAGTGATCGGATAAGGAAGCTACAAGAGCTTGTACCTAACATG 
GACAAGCAAACCAACACTGCAGACATGTTAGAAGAAGCAGTAGAATACGTGAAAGTTCTT

CAAAGGCAGATCCAGGAGTTAACAGAAGAACAGAAGAGGTGCACATGCATACCTT^AGGAA 

GAACAATAAGGTTTGCTCCTGATTTGTTTTATATTTGCTTAACGGCAATGATCTGATCGA 

AAAATTCGAAAGATGATCTTAGCTTGAATTTAGATGGATGTCATGTTGAAAAGTATA 

TTTGATAAATGGATGTAGGTGTAATATAAAATTTTTGTACAATAATGAAGAAAGTTAAAA 

AGAATTAATGAAAACATATATTCTTTATGATATAAAAAAAAAAA 

>G1134 Amino Acid Sequence (domain in AA coordinates: 198-247) 

MQPTSVGSSGGGDDGGGRGGGGGLSRSGLSRIRSAPATWLEALLEEDEEESLKPNLGLTD 

LLTGNSNDLPTSRGSFEFPIPVEQGLYQQGGFHRQNSTPADFLSGSDGFIQSFGIQANYD 

YLSGNIDVSPGSKRSREMEALFSSPEFTSQMKGEQSSGQVPTGVSSMSDMNMENLMEDSV 

AFRVRAKRGCATHPRSIAERVRRTRISDRIRKLQELVPNMDKQTNTADMLEEAVEYVKVL

QRQIQELTEEQKRCTCIPKEEQ* 

>G1008 (89..973)

GCCTTTTTGACTCTTCTTTCTCTCTTCTACTTTTTTTCAGGCTCTCTCT 

TCTTCTTCTCCGGTTAACTAAAAGAGAAATGAAAAGCCGAGTGAGAAAATCCAAGTACAC 

GGTTC^CCGGAAAATCACATCCACACCGTTCGACGGTTTCCCGAAGATTGTCAAAATCAT 

AGTCACTGACCCATGCGCTACTGATTCTTCCAGCGATGAGGAAAACGACAACAAATCTGT 

TGCTCCGAGGGTGAAACGTTATGTGGATGAGATCAGGTTCTGTGACGAAGATGACGAACC 

TAAACCGGCGAGGAAAGCGAAGAAAAAGTCCCCGGCGGCTGCGGCGGAGAACGGTGGAGA 

TTTGGTAAAGTCTGTGGTGAAGTATAGAGGAGTGAGACAACGACCTTGGGGAAAATTTGC 

GGCGGAGATTCGTGATCCTTCGAGTCGTACTAGACTCTGGCTTGGGACTTTTGCGACGGC 

GGAGGAAGCTGCTATAGGTTACGATAGAGCCGCGATTCGAATCAAAGGTCATAACGCTCA 

GACGAATTTTCTCACTCCTCCTCCTAGTCCGACGACTGAGGTGTTACCGGAAACTCCGGT 

GATTGACCTTGAAACTGTCTCTGGTTGTGATTCGGCGAGGGAATCGCAAATCAGTCTGTG 

TTCTCCGACTTCTGTTCTCCGGTTTAGTCACAACGACGAAACAGAGTACAGAACAGAGCC 

AACGGAAGAACAAAATCCGTTTTTCTTGCCTGATTTGTTTCGCTCCGGAGATTATTTTTG 

GGATTCCGAAATTACCCCTGACCCTTTGTTTCT 

AAACATCAACAACAACAACACAGTGTGTGATAAGGATACGAATCTGTCTGATAGTTTTCC 
GTTGGGAGTGATCGGAGATTTCAGCTCATGGGATGTTGATGAGTTTTTCCAAGATCATTT 
GTTGGATAAGTAATTTGATGAGTTCTTCCCCAGAATTTTTCTGGGTTTCTCTTTTTGGTT 
GTGTGAGTGAGATGAGTGGTTTGATGACAACGACGGGGATGAATCTTAGCCGTCCGTTTT 
CCATTTCGTGGACGGCTCCGATCAGCGGAAGAAGCGCAACGGAGTTTTTATTTATCTGTT 
TGAGAATTTTATAATTTAATTTGCGAGTAAATATAGTAATTAGTGTTAAGATTGTGAGAG 
TTTAAGTTAATTAGGGAGGGGTTTTGAATATTGGGGATTTTGGGAGGTTTTTGTTTGGTT 
TCTCTCCAAGTCTGTCACTATGCAAGGAAGCAGTATAAAGACCGTATATATATTTTATTA 
TTAATATTGATAAAAGTAAAAAAAAAAAAAAAAA 

>G1008 Amino Acid Sequence (domain in AA coordinates: 96-163) 
MKSRVRKSKYTVHRKITSTPFDGFPKIVKIIVTDPCATDSSSDEENDNKSVAPRVKRYVD
EIRFCDEDDEPKPARKAKKKSPAAAAENGGDLVKSVVKYRGVRQRPWGKFAAEIRDPSSR
TRLWLGTFATAEEAAIGYDRAAIRIKGHNAQTNFLTPPPSPTTEVLPETPVIDLETVSGC
DSARESQISLCSPTSVLRFSHNDETEYRTEPTEEQNPFFLPDLFRSGDYFWDSEITPDPL
FLDEFHQSLLPNINMNNTVCDKDTNLSDSFPLGVIGDFSSWDVDEFFQDHLLDK*
>G1020 (132..689)

CTGTTCACAAGAAAGCTCCCCAAAAGGAGCGTTGCTTTACTCTCCTATAAAAAGAAGCTC 
TTCTACTTOTTCTCGTTACC^CAAAACTCTTTC^CCGATCTTCTCGTTCCATTCTTCTTC 
CTAATTACACCATGCCCAACATCACCATGGGTTTGAAACCCGACCCGGTTGCTCCAACGA 
ACCCGACTCATCATGAGAGTAATGCTGCCAAAGAGATTCGTTACAGAGGCGTTAGGAAAC 
GTCCATGGGGAAGATACGCCGCTGAGATCCGAGATCCGGTTAAGAAAACTCGAGTCTGGC 
TCGGTACGTTCGACACCGCTCAGCAGGCGGCGCGTGCTTACGACGCAGCCGCGCGTGACT 
TTCGTGGTGTTAAGGeTAAGACCAATTTCGGTGTTATCGTTGGTAGTAGTCCTACTCAGA 
GTAGCACCGTCGTCGACTCTCCCACGGCGGCACGGTTTATAACACCTCCGCACCTCGAGC 
TCAGCTTAGGCGGCGGCGGCGCGTGTCGTCGTAAGATCCCGCTTGTGCATCCGGTTTACT 
ACTATAACATGGCGACGTATCCAAAGATGACGACGTGTGGTGTCCAGAGCGAGTCTGAAA 
CGTCGTCGGTCGTTGATTTCGAAGGTGGAGCTGGGAAGATATCTCCGCCGTTAGATCTGG 
ATCTTAACTTAGCTCCTCCGGCGGAATAGGCCGTGAGTTTTTTTTTTCTTATGTCGTTTC 
TTTAGACAAAAAAAAATAACGTTTCCTTTTTTTTTCTGCCTAAGAAAAAAATATTATCCG 
TTTTTTAGAAGAAAAAAAAAAAAAAAAAAAAAA 






>G1020 Amino Acid Sequence (domain in AA coordinates: 28-95)
MPNITMGLKPDPVAPTNPTHHESNAAKEIRYRGVRKRPWGRYAAEIRDPVKKTRVWLGTF
DTAQQAARAYDAAARDPRGVKAKTNFGVIVGSSPTQSSTVVDSPTAARFITPPHLELSLG
GGGACRRKIPLVHPVYYYNMATYPKMTTCGVQSESETSSVVDFEGGAGKISPPLDLDLNL
APPAE* 

>G1023 (252..1250)

TCGTCTTCTTAATCGCTTTCTGCTCTGTTTTTCTCGTTCATCAAGCTACATCTACTAGCT 

CTCTCAGTGATTGATTTCTCACAGTTTCATCGATTTCCATC 

CTTGTTCTGGGGTAAAGGACTTTTCTTGTTCTTC 

GGAATTTTGAGAGGTTTTTTAGGGTTTAAGGGGGTTTGGTTTTGAATTT 

TGTTCGATAAAATGGCTGAACGAAAGAAACGCTCTTCTATTCAAACCAATAAACCCAACA 

AAAAACCCATGAAGAAGAAACCTTTTCAGCTAAATCACCTCCGAGGTTTATCTGAAGATT 

TGAAGACTATGAGAAAACTCCGTTTCGTTGTGAATGATCCTTACGCTACTGACTACTCAT 

CAAGCGAAGAAGAAGAAAGGAGTCAGAGAAGGAAACGTTATGTCTGTGAGATCGATCTTC 

CTTTCGCTCAAGCTGCTACTCAAGCAGAATCTGAAAGCTCATATTGTCAGGAGAGTAACA 

ATAATGGTGTAAGCT^GACTAAAATCTCAGCTTGTAGCAAAAAGGTTTTACGCAGCAAAG 

CATCTCCGGTCGTTGGACGTTCT^CTACTACTGTCTCGAAGCCTGTTGGTGTTAGGCAGA 

GGAAATGGGGTAAATGGGCTGCTGAGATTAGACATCCAATCACCAAAGTAAGAACTTGGT 

TGGGTACTTACGAGACGCTTGAACTVAGCAGCTGATGCTTATGCTACC^GAAGCTTGAGT 

TTGATGCTCTGGCTGC^GCC^CTTCTGCTGCTTCCTCTGTTTTGTCAAATGAGTCTGG^ 

CTATGATCTCAGCCTCAGGGTC^GCATTGATCTTGACAAGAAGCTAGTTGATTCGACTC 

TTGATCAACAAGCTGGTGAATCGAAGAAAGCGAGTTTTGATTTCGACTTTGCAGATCTAC 

AGATTCCTGAAATGGGTTGCTTCATTGATGACTCATTCATCCCAAATGCTTGTGA 

ATTTTCTCTTAACAGAAGAGAACAAGAACCAAA 

ATCTGGAC7\.TGATTGGTCTTGAATGTGACGGTCCAAGCGAACTTCCAGACTATGATTTCT 
CAGATGTGGAGATCGATCTTGGTCTC^TTGGAACCACCATTGACAAGTATGCTTTCGTTG 
ATCATATCGCAACAACTACTCCCACTCCTCTTAATATCGCGTGCCCATAAGTTTTGCAGC 
TAGGTGTTATTATTAGCTATAGGAGCAACGTAAAAAGCTCGTTGTTACTCGGTTTTGTCT 
TAAGTTATTAAAGTATAGCAGAGGCAGTTAATCTCAAGGGAAGCAAAAACCCTAAAGATA 
GAAGC^GATGCAGTTTTGTGTGTTGGTGTTACTAAAGAAAGTTTTGTTGACATAATGGTT 
TTGATGTTGTGGAGAAGATAGAGAGGTGTGATCGAAATTGTAAATCTCAGGTGGXTTTTT 
TTGAAGGCAATTGTTTCTCATTTAGGGTTTTTTTCTATATGAGGATTGTCTTTGAAAAGC 
CTTTAGATGTTTTCTAATTCGTAAGCTCTCTCAATCTTTGTAAGTTTTGCCTGTTGAGTT 
ATTGATAC^TATGTGAGACCTACTTTATTTGTTTTGTGCTACATACATTGTTGATGGTTT 
CGTCAAAAAAAAA 

>G1023 Amino Acid Sequence (conserved domain in AA coordinates: 128-195)
MAERKKKSSIQTNKPNKKPMKKiCPFQLNHLPGLSEDLKTM 

EERSQRRKRYVCEIDLPFAQAATQAESESSYCQESNNNGVSKTKISACSKKVLRSKASPV 

VGRSSTTVSKPVGVRQRKWGKWAAEIRHPITKVRTWLGTYETLEQAADAYATKKLEFDAL

AAATSAASSVLSNESGSMISASGSSIDLDKKLVDSTLDQQAGESKKASFDFDFADLQIPE

MGCFIDDSFIPNACELDFLLTEENNNQMLDDYCGIDDLDIIGLECDGPSELPDYDFSDVE

IDLGLIGTTIDKYAFVDHIATTTPTPLNIACP* 

>G1053 (38..538)

GAAACTCTTACATACTCATATAAACCAAACTAAAACCATGATTCCGGCAGAAATCAACGG 

ATATTTCCAATATCTATCACCGGAATACAACGTAATAAACATGCCTTCATCTCCAACCTC 

TTCCTTAAACTACCTAAACGATTTGATCATCAACAACAACAACTATTCCTCATCATCCAA 

CAGTCT^AGATCTCATGATAAGCAACAACTCAACTTCCGACGAAGATCATCATCAAAGCAT 

CATGGTACTCGACGAGAGGAAACAGAGAAGGATGCTTTCGAACAGAGAATCTGCAAGGAG 

GTCAAGGATGAGGAAACAGAGACATCTTGATGAACTCTGGTCTCAGGTAATAAGGCTTCG 

CAACGAGAACAACTGTCTTATCGATAAGCTGAACCGCGTATCGGAGACTCAAAATTGTGT 

ATTGAAGGAGAACTCTAAACTCAAAGAAGAAGCTTCTGATCTCCGACAGCTTGTTTGTGA 

ACTGAAATCTAACAAGAAC^CAACAATAGTTTTCCAAGAGAGTTTGAAGATAATTAG 

TTACTCAAA 

>G1053 Amino Acid Sequence (domain in AA coordinates: 74-120) 
MIPAEINGYFQYLSPEYIWII^PSSPTSSLNYLND^ 

DEDHHQSIMVljDERKQRRMLSNRESARRSRMRKQRHLDELWSQVIRLRNENNCLIDKL^ 
VSETQNCVLKENSKLKEEASDLRQLVCELKSNKNNNNSFPREFEDN* 
>G1137 (202..1248)

TACTTCAGACTTCTACTCAAACCAGTCACGTAGTTGGTTGGTGACATTTCGCTGCATTTT 
TCAATCTGTGATTGTTTTTCGTTCGTCm 

AAGTATTGCATTCACTCAGTTGAGCAACTTAACAATCGTGTTGTACTTTTTGAAGTTCCC 
TTGAGCTAAACTGCTAAGAGCATGCCTCTGGATAAGAGGCAACGGGATTTGCCTCTGGGC 
TTAAGTCCTCAAGCTTGCTTCAAGGATATAGTAGGTCGGTCTGTCCTTCCTAGAATTCCT 
CTCCCTGAGCTTGGGAAACTATATGCAGCTAAGCTTCAGGCTC 

CCATTCCAGTCTTTGCTGTGCAGTCATGATAAGGAGTCTTATGGAAAAAGATTCTCACGG 
TCTGACATGCGGTCTTGGTGCGCTGCTGCTACTACTACTACTACTCCACTTGGAGCATTA 
GAGTCTTCTCAGAAAAGACTTTTGATATTCGATCAGTCAGGAGACCAGACTCGTCTATTA 
CAATGTCCATTTCCTCTACGGTTTCCATCTCATGCGGCTGCAGAACCAGTGAAACTCTCT 
GAGTTACAAGGTATAGAGAAAGCTTTCAAAGAAGATGGTGAAGAGTTTCACAAGAGTGAT 
GGAACAGAGTCAGAAATGCATGAAGACACTGAGGAGATCAATGCATTGCTATATTCAGAT 
GATGATTATGATGATGATTGCGAGAGTGATGATGAAGTAATGAGCACTGGTCACTCTCCT 
TATCCAAATGAAGGAGTTTGCAACAAAAGGGAATTAGAAGAAATCGATGGTCCTTGTAAA 
AGGCAGAAACTACTGGATAAGGTCAACAACATCAGCGACTTATCATCACTTGTGGGCACT 
GAGAGCTCCACACAACTCAATGGATCTTCCTTTCTT^ 

AAAACCATATCGACCAAAGAGGACACTGGTTCTGGTCTGAGCAACGAGCAGTCGAAGAAA 

GACAAGATCCGCACAGCTCTGAAAATACTCGAGAGCGTAGTCCCTGGTGCAAAAGGAAAC 

GAAGCGCTCTTACTTCTGGACGAAGCAATTGATTACCTAAAGTTGCTGAAACGAGACTTA 

ATCTCCACAGAGGTTAAGAACCAAAGCTCCACCACTCACAAGTCACCAATCTTGTTGCTT 

AAAGAGACAACATGGGGAACAAGAAATCTGCAGACAGATAAGGCGTGAAAGATTCTGACG 

AGTTAAAACGTGTGAAGTGGGTTTTTGGGTACGTATCCTTGCACCAGCTTT 

>G1137 Amino Acid Sequence (domain in AA coordinates: 264-314)

MPLDKRQRDLPLGLSPQACFKDIVGRSVLPRIPLPELGKLYAAKLQARCLQPPPFQSLLC 

SHDKESYGKRFSRSDI^SWCAAATTTTTPLGALESSQK3UjLIFDQSGDQTRLIjQCPFP 

FPSHAAAEPVKLSELQGIEKAFKEDGEEFHKSDGTESEMHEDTEEINALLYSDDDYDDDC 

ESDDEVMSTGHSPYPNEGVCNKRELEEIDGPCKRQKLLDKVNNISDLSSLVGTESSTQLN 

GSSFLKDKKLPESKTISTKEDTGSGLSNEQSKKDKIRTALKILESVVPGAKGNEALLLLD 

EAIDYLKLLKRDLISTEVKNQSSTTHKSPILLLKETTWGTRNLQTDKA*

>G1181 (113..1012)

CTCGATCTTTTAACCCCCATTATTACATATTACTCCTTCCTACATTATTCTTCTTCTGCT 
TTCGTGACTTTCAGGGGACACTTTTGTTTTTATAACTTACGCTTAAAATCCTATGAATTC 
GCCGCCGGTTGACGC^ATGATTACCGGAGAATCATCGTCACAAAGATCTATCCCAACGCC 
GTTTCTCACAAAAACGTTTAACCTCGTTGAAGATAGTTCCATCGACGATGTTATCTCATG 
GAACGAAGATGGTTCCTCTTTCATCGTATGGAATCCGACAGATTTCGCTAAAGATTTGCT 
TCCTAAACACTTCAAACACAACAATTTCTCTAGTTTCGTTCGTCAGCTCAACACTTACGG 
ATTCAAAAAAGTTGTACCGGATCGATGGGAGTTTTCAAACGATTTCTTTAAGAGAGGAGA 
AAAACGTCTTCTCCGTGAGATCCAACGTCGGAAAATAACAACGACGCATCAAACAGTTGT 
TGCTCCTTCGTCGGAACAACGAAACCAGACGATGGTTGTATCACCGTCAAATTCCGGGGA 
AGATAATAATAATAATCAGGTGATGTCTTCGTCTCCGTCGTCGTGGTATTGTCATCAAAC 
GAAGACGACTGGGAATGGTGGTTTATCAGTGGAGTTATTGGAAGAGAACGAGAAGCTTCG 
GAGTCAAAACATTCAGCTAAACCGTGAGCTTACTCAGATGAAATCTATCTGCGATAATAT 
CTATAGTCTCATGTCGAATTACGTCGGATCTCAGCCCACTGATCGGAGTTATTCTCCCGG 
AGGTAGTAGTAGTCAACCGATGGAGTTTTTACCGGCGAAGCGGTTTTCGGAGATGGAGAT 
TGAAG^AGAAGAAGAAGCGAGTCCGAGGTTGTTTGGTGTTCCGATTGGGTTAAAACGGAC 
GAGAAGTGAAGGTGTTCAGGTGAAGACGACGGCGGTGGTTGGGGAAAATTCCGATGAGGA 
GACGCCGTGGTTGAeACATTATAATCGAACCAATCAGAGAGTTTGTAATTAAAAACGAAC 
GGTTTAGATTTGTGGTGTAGATATGTGCGCGAAGTAGACGATTACAGCTTTTTAAGACAA 
GCAGAGCACGTGTCCCATCTGTTTCAAGAAGTTTCTGCAATCTTGACTTCTTCTTTTAAC 
ACTTTGTGTTTTTTATTATTTAATTAATAACAATAAATGTTCTTTTTCAGTTTTGTTTTC 
TTCAAAAATAGTTCGGCTGTTTCTAGACTTTCCTTTTTT 

>G1181 Amino Acid Sequence (domain in AA coordinates: 24-114) 
MNSPPVDAMITGESSSQRSIPTPFLTKTFNLVEDSSIDDVISWNEDGSSFIVWNPTDFAK
DLLPKHFKHNNFSSFVRQLNTYGFKKVVPDRWEFSNDFFKRGEKRLLREIQRRKITTTHQ

TVVAPSSEQRNQTMVVSPSNSGEDNNNNQVMSSSPSSWYCHQTKTTGNGGLSVELLEENE
KLRSQNIQLNRELTQMKSICDNIYSLMSNYVGSQPTDRSYSPGGSSSQPMEFLPAKRFSE 
MEIEEEEEASPRLFGVPIGLKRTRSEGVQVKTTAVVGENSDEETPWLRHYNRTNQRVCN*
>G1228 (63..1139)

GCATTTATAATTACTCACTCATCTTCTTTTCATTACATTACATACCAAA 

AAATGGAAAGGTTTCAAGGACACATCAACCCCTGTTTCTTCGATCGAAAACCGGATGTGA 

GAAGCCTCGAGGTTCAAGGATTTGCAGAGGCTCAAAGCTTTGCTTTCAAAGAAAAAGAGG

AAGAAAGCTTACAAGATACAGTTCCATTTCTACAGATGCTGCAAAGTGAAGACCCCTCAT 

CG' lU^ l Tl ^CAATCAAAGAGCCAAACTTTCTGACGCTACTGTCTCTTCAAACCCTCAAGG 

AGCCTTGGGAACTCGAAAGATATCTTTCACTTGAGGATTCACAATTTCATTCACCGGTCC 

AATCTGAGACCAACCGCTTCATGGAAGGAGCCAATCAAGCTGTGTCAAGCCAAGAAATTC

CCTTTAGCCAAGCAAACATGACACTCCCTTCTTCTACCTCATCACCACTCAGTGCACATT 

CAAGACGAAAGCGCAAAATCAACCACTTGCTGCCTC71AGAAATGACTAGAGAAAAGAGAA 

AGAGGAGGAAAACAAAACCAAGTAAAAACAATGAAGAGATTGAGAATCAAAGAATAAACC 

ACATTGCTGTTGAACGAAACAGAAGACGTCAAATGAACGAACATATCAACTCTCTCCGGG 

CCCTTCTCCCACCTTCCTACATCCAACGAGGAGACCAAGCTTCGATAGTAGGAGGAGCAA 

TAAACTACGTGAAGGTCCTCGAGCAAATCATACAATCTCTCGAATCGCAAAAGAGAACGC 

AAC^C^AAGTAACAGTGAGGTAGTAGAAAACGCACTTAATCATCTCTCAGGCATTTCGT 

CGAACGACCTGTGGACAACTCTTGAAGATCAAACTTGTATCCCCAAAATCGAAGCTACAG 

TGATACAAAACCATGTCAGCCTTAAAGTTCAATGTGAGAAGAAACAAGGACAACTTCT 

AAGGAATCATATCACTTGAAAAGCTTAAACTCACTGTTCTTCATCTCAATATCACT^ 

CGTCTCATTCCTCTGTTTCTTATTCCTTCAACCTCAAGATGGAAGATGAGTGCGACTTAG 

AGTCAGCCGACGAGATTACGGCGGCTGTTCATCGGATTTTCGATATTCCGACAATTTGAT 

TAAACACATATAATTCCAAAAATATTAACAGCTGACAAAATGGTATCTTTGCGGCC 

>G1228 Amino Acid Sequence (domain in AA coordinates: 179-233) 

MERFQGHINPCFFDRKPDVRSLEVQGFAEAQSFAFKEKEEESLQDTVPFLQMLQSEDPSS 

FFSIKEPNFLTLLSLQTLKEPWELERYLSLEDSQFHSPVQSETNRFMEGANQAVSSQEIP 

FSQANMTLPSSTSSPLSAHSRRKRKINHLLPQEMTREKRKRRKTKPSKNNEEIENQRINH

IAVERNRRRQMNEHINSLRALLPPSYIQRGDQASIVGGAINYVKVLEQIIQSLESQKRTQ 

QQSNSEVVENALNHLSGISSNDLWTTLEDQTCIPKIEATVIQNHVSLKYQCEKKQGQLLK

GIISLEKLKLTVLHLNITTSSHSSVSYSFNLKMEDECDLESADEITAAVHRIFDIPTI* 

>G1277 (51..512)

ATTCTAAAGTCCTCCTCTCGGAAAGTAAGAGACTCAACTTCCGAGCCGCCATGGACGCCG 
GAGTAGCAGTAAAAGCTGACGTGGCAGTCAAAATGAAGAGAGAAAGACCATTCAAAGGGA 
TCAGAATGAGAAAATGGGGGAAATGGGTTGCGGAGATTCGAGAACCCAACAAGCGTTCAA 
GACTTTGGCTCGGCTCTTACTCTACTCCCGAAGCGGCGGCGCGTGCATACGACACGGCTG 
TCTTTTACCTCAGAGGACCAACTGCTACGCTCAACTTCCCGGAGCTTCTGCCGTGTACCT 
CCGCCGAGGATATGTCAGCGGCAACGATCAGGAAAAAGGCGACGGAGGTGGGAGCTCAAG 
TAGATGCGATAGGGGCGACGGTGGTGCAGAACAACAAACGCCGCCGCGTTTTTAGTCAAA 
AGCGTGACTTTGGCGGCGGGTTATTAGAGCTTGTTGACTTGAACAAGTTACCTGACCCGG 
AAAATCTCGATGATGATTTGGTGGGAAAATAGACTGAAAAATAATAATAAAATATCTTAC 
AATGGTGGCTGTAGCTATCGTACGCGGAATGCTTGGGCTTGTGTTATATGACTACGTGGT 
TACGGAAAGATTCCTCTGTTTCGTCATTGTATTAAAATTTAATCCCACAAGTCAAACATA 
CTGTACATTATTCTTAATTTAGTATTTTCTTATTAATATCTATCATTTGTTTGGTGAACA 
CCAGAATATTAGACTATTAATGTAACGAGTTTTTAATATTTCGATCATAATAACACCAAG 
CTAGTTAAAGGTTAATATCTTGTTACGAAGTCTTGAGTAAGTTCAATTGTCATATATATG 
TAACGGAAGAGGTTCGTTCGGGTCCCAAGTGAAGTGGATCAAAGGTGACTTCACATAAAA 
AATAAAAAAAA 

>G1277 Amino Acid Sequence (domain in AA coordinates: 18-85) 
MDAGVAVKADVAVKMKRERPFKGIRMRKWGKWVAEIREPNKRSRLWLGSYSTPEAAARAY
DTAVFYLRGPTATLNFPELLPCTSAEDMSAATIRKKATEVGAQVDAIGATVVQNNKRRRV
FSQKRDFGGGLLELVDLNKLPDPENLDDDLVGK*
>G1309 (53..859)

CGTCGACCTCTTAATTAAGACGACTTGAGAGAGA7\AGAAAGATACGTGGAAGATGACCAA 
ATCTGGAGAGAGACCAAAACAGAGACAGAGGAAAGGGTTATGGTCACCTGAAGAAGACCA 
GAAGCTCAAGAGTTTCATCCTCTCTCGTGGCCATGCTTGCTGGACCACTGTTCCCATCCT 
AGCTGGATTGCAAAGGAATGGGAAAAGCTGCAGATTAAGGTGGATTAATTACCTAAGACC 
AGGACTAAAGAGGGGGTCGTTTAGTGAAGAAGAAGAAGAGACCATCTTGACTTTACATTC 
TTCCTTGGGTAACAAGTGGTCTCGGATTGCAAAATATTTACCGGGAAGAACAGACAACGA 
GATTAAGAACTATTGGCATTCCTATCTGAAGAAGAGATGGCTCAAATCTCAACCACAACT
CAAAAGCCAAATATCAGACCTCACAGAATCTCCTTCTTCACTACTTTCTTGCGGGAAAAG 
AAATCTGGAAACCGAAACCCTAGATCACGTGATCTCCTTCCAGAAATTTTCAGAGAATCC 
AACTTCATCACCATCCAAAGAAAGCAACAACAACATGATCATGAACAACAGTAATAACTT 
GCCTAAACTGTTCTTCTCTGAGTGGATC^GTTCTTCAAATCCACACATCGATTACTCCTC 
TGCTTTTAGAGATTCCAAGCACATTAATGAAACTCAAGATCAAATCAATGAAGAGGAAGT
GATGATGATCAATAACAACAACTACTCTTCACTTGAGGATGTCATGCTCCGTACAGATTT
TTTGCAGCCTGATCATGAATATGCAAATTATTATTCTTCTGGAGATTTCTTCATCAACAG
TGACCAAAATTATGTCTAAGAAGAGTGAATATGATCGTAAGAGGAACATAAGCTAGTTAC 
TTGTGTTACAGC 

>G1309 Amino Acid Sequence (domain in AA coordinates: 9-114) 
MTKSGERPKQRQRKGLWSPEEDQKLKSFILSRGHACWTTVPILAGLQRNGKSCRLRWINY
LRPGLKRGSFSEEEEETILTLHSSLGNKWSRIAKYLPGRTDNEIKNYWHSYLKKRWLKSQ
PQLKSQISDLTESPSSLLSCGKRNLETETLDHVISFQKFSENPTSSPSKESNNNMIMNNS
NNLPKLFFSEWISSSNPHIDYSSAFTDSKHINETQDQINEEEVMMINNNNYSSLEDVMLR
TDFLQPDHEYANYYSSGDFFINSDQNYV*
>G1314 (1..990) 

ATGGGAAGAGCTCCGTGTTGCGACAAGACAAAAGTGAAGCGAGGGCCTTGGTCGCCTGAA 
GAAGACTCTAAACTTAGAGATTACATTGAAAAGTATGGTAATGGTGGAAATTGGATCTCT 
TTCCCCCTCAAAGCCGGTTTGAGGAGATGTGGGAAGAGTTGTAGACTGAGGTGGCTAAAC 
TATTTGAGACCAAACATAAAGCATGGTGACTTCTCTGAGGAAGAAGACAGGATCATTTTT 
AGTCTCTTCGCTGCCATAGGAAGCAGGTGGTCAATAATAGCAGCTCATCTACCGGGACGA 
ACAGACAACGACATAAAAAACTATTGGAACACAAAGCTAAGGAAGAAACTCTTGTCTTCT 
TCCTCTGATTCATCATCATCAGCCATGGCTTCTCCTTATCTAAACCCTATTTCTCAGGAT 
GTGAAAAGACCAACCTCACCAACAACAATCCCATCTTCTTCTTACAATCCGTATGCTGAA 
AACCCTAATCAATACCCAACAAAATCCCTCATCTCCAGCATCAATGGCTTCGAAGCTGGT 
GACAAACAGATAATTTCCTATATTAACCCTAATTATCCTCAAGATCTCTATCTCTCGGAC 
AGCAACAACAACACCTCGAACGCAAATGGTTTCTTGCTCAACCACAATATGTGTGATCAG 
TACAAGAACCACACCAGTTTTTCTTCAGACGTCAATGGGATAAGATCAGAGATTATGATG 
AAGCAAGAAGAGATAATGATGATGATGATGATAGACCACCACATTGACCAGAGGACAAAA 
GGGTACAATGGGGAATTCACACAAGGGTATTATAATTACTACAATGGGCATGGGGATTTG 
AAGCAAATGATTAGTGGAACAGGCACTAATTCTAACATAAACATGGGTGGTTCAGGTTCA 
TCTTCTAGTTCGATAAGCAACCTAGCTGAGAACAAAAGCAGTGGTAGCCTCCTACTAGAA 
TACAAATGCTTGCCCTATTTCTACTCCTAG 

>G1314 Amino Acid Sequence (domain in AA coordinates: 14-116) 
MGRAPCCDKTKVKRGPWSPEEDSKLRDYIEKYGNGGNWISFPLKAGLRRCGKSCRLRWLN 
YLRPNIKHGDFSEEEDRIIFSLFAAIGSRWSIIAAHLPGRTDNDIKNYWNTKLRKKLLSS
SSDSSSSAMASPYLNPISQDVKRPTSPTTIPSSSYNPYAENPNQYPTKSLISSINGFEAG
DKQIISYINPNYPQDLYLSDSNNNTSNANGFLLNHNMCDQYKNHTSFSSDVNGIRSEIMM

KQEEIMMMMMIDHHIDQRTKGYNGEFTQGYYNYYNGHGDLKQMISGTGTNSNINMGGSGS
SSSSISNLAENKSSGSLLLEYKCLPYFYS*
>G1317 (1..849) 

ATGGGAAGATCACCTTGTTGTGATAAAAATGGAGTGAAGAAGGGACCATGGACTGCTGAG 
GAGGATCAGAAACTCATCGATTATATTCGATTTCATGGTCCTGGCAATTGGCGTACGCTC 
CCGAAAAATGCTGGACTCCATAGATGTGGAAAAAGCTGCCGTCTTCGATGGACCAATTAT
CTAAGACCGGACATCAAGAGAGGAAGATTCTCGTTCGAGGAAGAAGAAACTATCATTCAG 
CTACACAGTGTTATGGGAAACAAGTGGTCAGCAATAGCCGCTCGTCTACCAGGGAGGACC 
GATAACGAAATAAAAAACCATTGGAACACTCACATCCGCAAGAGACTTGTAAGGAGTGGT 
ATCGACCCTGTTACTCATTCTCCACGCCTTGATCTTCTTGATTTGTCCTCACTTTTGAGT 
GCACTTTTCAACCAGeCAAACTTTTCAGCAGTTGCAACACATGCGTCTTCTCTTCTTAAT 
CCTGATGTATTGAGGTTGGCCTCTCTACTACTGCCACTTCAAAACCCTAATCCAGTTTAC 
CCATCGAACCTCGACCAAAATCTTCAAACTCCAAATACATCATCAGAATCGTCTCAACCA 
CAAGCTGAGACTAGTACAGTCCCAACAAACTATGAAACTTCATCATTGGAGCCTATGAAC 
GCAAGACTCGACGACGTTGGTCTTGCAGATGTATTACCACCTTTGTCAGAGAGTTTTGAC 
TTAGACTCGCTCATGTCAACGCCAATGTCTTCTCCACGACAAAATAGCATTGAAGCAGAA 
ACCAACTCCAGCACTTTCTTCGACTTTGGAATTCCGGAAGATTTCATCTTAGATGACTTT 

ATGTTTTAA 
>G1317 Amino Acid Sequence (conserved domain in AA coordinates: 13-118)
MGRSPCCDKNGVKKGPWTAEEDQKLIDYIRFHGPGNWRTLPKNAGLHRCGKSCRLRWTNY
LRPDIKRGRFSFEEEETIIQLHSVMGNKWSAIAARLPGRTDNEIKNHWNTHIRKRLVRSG
IDPVTHSPRLDLLDLSSLLSALFNQPNFSAVATHASSLLNPDVLRLASLLLPLQNPNPVY

PSNLDQNLQTPNTSSESSQPQAETSTVPTNYETSSLEPMNARLDDVGLADVLPPLSESFD

LDSLMSTPMSSPRQNSIEAETNSSTFFDFGIPEDFILDDFMF* 

>G1323 (49..870)

AAGAGGGAATCTCAAAAGTGTGTGTCTGTGAGAGAGGAGAGAGAGAATATGGGCAAAGGA 

AGAGCACCATGTOGTGACAAAACCAAAGTGAAGAGAGGACCATGGAGCCATGATGAAGAC 

TTGAAACTCATCTCTTTCATTCACAAGAATGGT(^TGAGAATTGGAGATCTCTCCCAAAG 

CAAGCTGGATTGTTGAGGTGTGGCAAGAGTTGTCGTCTGCGATGGATTAATTACCTCAGA 

CCTGATGTGAAACGTGGCAATTTCAGTGCAGAGGAAGAAGACACCA^ 

CAGAGCTTTGGTAACAAGTGGTCGAAGATTGCTTCTAAGCTGCCTGGAAGAACAGACAAT 

GAGATCAAGAATGTGTGGCATAC^CATCTCAAGAAAAGATTGAGCTCGGAAACTAACCTT 

AATGCCGATGAAGCGGGTTCAAAAGGTTCTTTGAATGAAGAAGAGAACTCTCAAGAGTCA 

TCTCCAAATGCTTCAATGTCTTTTGCTGGTT^ 

CAGATAAGTCAAATGTTTGAGCACATTCTAACTTATAGCGAGTTTACGGGGATGTTACAA 
GAGGTAGACAAACCAGAGCTGCTGGAGATGCCTTTTGATTTAGATCCTGACATTTGGAGT 
TTCATAGATGGTTCAGACTCATTCGAACAAC^ 

GAAGATGAAGTTGATAAATGGTTTAAGCACCTGGAAAGCGAACTCGGGTTAGAAGAAAAC 
GATAACCAACAACAACAACAGCATAAACAGGGAACAGAAGATGAACATTCATCATCACTC 
TTGGAGAGTTACGAGCTCCTCATACATTAATGAAGCCATAAAGCAAGTCATTTTCACCTT 
GAAAATGGAATTATTAGCTAACTTATTGGCATTATTAGTATATAAGCAAGATCAGATAGG 
CGCATGTAGTAGCAACAACGAAGAAACGTCGAATTGTAGACAAAATGTAGATATTACAGA 
GTTGAAAGATTGTATTTTGCAAATGATTGCTTTGTAGTGAAATCAAGTTATCACAAAAAA 
AAAAAAAA 

>G1323 Amino Acid Sequence (domain in AA coordinates: 15-116) 

MGKGRAPCCDKTKVKRGPWSHDEDLKLISFIHKNGHENWRSLPKQAGLLRCGKSCRLRWI

NYLRPDVKRGNFSAEEEDTIIKLHQSFGNKWSKIASKLPGRTDNEIKNVWHTHLKKRLSS

ETNLNADEAGSKGSLNEEENSQESSPNASMSFAGSNISSKDDDAQISQMFEHILTYSEFT 

GMLQEVDKPELLEMPFDLDPDIWSFIDGSDSFQQPENRALQESEEDEVDKWFKHLESELG 

LEENDNQQQQQHKQGTEDEHSSSLLESYELLIH* 

>G1332 (1..606) 

ATGGAATGCAAAAGAGAAGAAGGGAAGTCTTACGTGAAGAGAGGGTTGTGGAAACCAGAA 
GAAGATATGATATTAAAAAGCTATGTTGAGACTCATGGTGAAGGAAACTGGGCAGACATT 
TCTCGTAGATCCGGGTTGAAGAGAGGAGGAAAAAGCTGTAGGCTGAGATGGAAGAACTAT 
CTAAGACCAAATATCAAAAGAGGAAGCATGTCACCACAAGAACAAGACCTTATCATCCGC 
ATGCATAAGCTTCTTGGAAACAGATGGTCGTTGATCGCTGGTCGCCTTCCAGGTCGTACT 
GACAATGAAGTGAAGAACTACTGGAATACTCATTTGAACAAGAAACCTAATTCCCGAAAA 
C^GAATGCACCTGAATC7ATCGTCGGCGCCACTCCTTT(^CGGATAAGCCAGTTATGTCT 
ACAGAACTGAGAAGAAGCCATGGAGAAGGAGGAGAAGAGGAGAGCAATACCTGGATGGAG 
GAGACCAACCACTTTGGCTATGACGTCCACGTAGGATCTCCCTTGCCACTTATTTCCCAC 
TACCCAGACAACACTCTCGTGTTTGACCCATGTTTTTCCTTTACCGATTTCTTTCCTCTG 
CTTTAG 

>G1332 Amino Acid Sequence (conserved domain in AA coordinates: 13-116)
MECKREEGKSYVKRGLWKPEEDMILKSYVETHGEGNWADISRRSGLKRGGKSCRLRWKNY
LRPNIKRGSMSPQEQDLIIRMHKLLGNRWSLIAGRLPGRTDNEVKNYWNTHLNKKPNSRK

QNAPESIVGATPFO^KPVMSTELRRSHGEGGEEESNTWMEETNHFGYDV^ 

YPDNTLVFDPCFSFTDFFPLL* 

>G1334 (76..885)

ATAGCTCCCAACTAATAGGAATCTCAAGCTTCTCACTCTCTCTTGTTTTTCCATTGGACT 
TTTGGAACATAAGCTATGCAAACTGAGGAGCTTTTGTCGCCACCACAGACTCCTTGGTGG 
AATGCTTTTGGATCTCAGCCGTTGACTACAGAGAGCCTTTCCGGCGAAGCTTCTGATTCA 
TTCACCGGAGTTAAGGCAGTTACTACGGAGGCAGAACAAGGTGTGGTGGATAAACAAACT 
TCTACAACTCTCTTCACTTTCTCACCTGGTGGTGAAAAGAGTTCAAGAGATGTGCCAAAG 
CCTC^TGTTGCTTTCGCGATGCAATCAGC^^ 

ATGTACACAAAGCATCCTCATGTTGAACAATACTATGGAGTTGTTTCAGCATACGGATCT 






CAGAGGTCTTCGGGCCGAGTAATGATTCCACTGAAGATGGAGACAGAAGAAGATGGTACC 

ATCTATGTGAACTCAAAGCAGTACCATGGAATTATCAGGCGACGCCAGTCCCGAGCAAAG 

GCTGAAAAACTGAGTAGATGCCGTAAGCCATATATGCATCACTCACGCCATCTCCATGCT 

ATGCGCCGTCCTAGAGGATOTGGCGGGCGTTTCTTGAACACCAAGACAGCTGATGCGGCT 

AAGC^GTCTAAGCCGAGTAATTCTCAGAGTTCTGAAGTCTTTCATCCGGAAAATGAGACC 

ATAAACTCATCGAGGGAAGCAAATGAGTCAAATCTCTCGGATTCTGCAGTTACAAGTATG 

GATTACTTTCTAAGTTCGTCGGCTTATTCTCCTGGTGGCATGGTCATGCCTATCAAGTGG 

AATGCAGCAGCAATGGATATTGGCTGCTGCAAACT^ 

CAAGAC^TGATTGGTCACCAGTCCTTTTGTCTTGTCCCTTATCTTTCAG 

GAGAACTTGTGTCTTGGAAAAAAGACATTGAGTTTCCTTGGTTTATAAGATTGGTCCTTT 

TACCATCCGTTTGGCTGTAAACAGGCAAATCATCTTT^ 

ATCTTCGTCTGTTTTCTTCTACGCATCTTCATAAGATCTCTGAACTAGTGAATAACATTT 
CCTAGCATCATGTTTCAACTAGTGTGTGTTGTAAGAAA 

GTATTGTGTGTAACGTGTTTATGAAACAAACGTAAGACTTTCAAGTTAAAAAAAAAAAAA 
AAAAAAAAAAAAAA 

>G1334 Amino Acid Sequence (domain in AA coordinates: 18-190) 
MQTEELLSPPQTPWWNAFGSQPLTTESLSGEASDSFTGVKAVTTEAEQGVVDKQTSTTLF

TFSPGGEKSSRDVPKPHVAFAMQSACFEFGFAQPMMYTKHPHVEQYYGVVSAYGSQRSSG
RVMIPLKMETEEDGTIYVNSKQYHGIIRRRQSRAKAEKLSRCRKPYMHHSRHLHAMRRPR

GSGGRFLNTKTADAAKQSKPSNSQSSEVFHPENETINSSREANESNLSDSAVTSMDYFLS 
SSAYSPGGMVMPIKWNAAAMDIGCCKLNI*
>G1381 (32..802)

C^GCTTTAACACTACTCTCTCTCTCTCTCAAATGGGAAAACAAATCAACATAGAGAGTAG 

TGCTACTCATCATCAAGACAATATTGTTTCCGTTATAACAGCCACGATATCCTCCTCCTC 

CGTCGTAACGTCTTCGTCAGACTCTTGGTCTACCTCCAAAAGATCGTTAGTGCAAGACAA 

TGACTCCGGAGGGAAACGGCGGAAGAGCAACGTTAGTGATGATAACAAGAATCCGACGTC 

GTATAGAGGAGTGAGGATGAGGAGTTGGGGAAAATGGGTGTCGGAGATTAGAGAGCCGAG 

GAAGAAATCAAGAATATGGCTTGGCACTTATCCAACGGCAGAGATGGCAGCTCGTGCTCA 

TGATGTGGCGGCTTTAGCTATTAAAGGCAACTCCGGTTTTCTTAATTTCCCTGAATTATC 

CGGTTTGCTTCCTCGTCCGGTTAGCTGCTCTCCTAAGGATATACAAGCTGCAGCTACCAA 

AGCCGCCGAAGCAACCACGTGGCACAAACCGGTTATCGATAAGAAATTAGCTGATGAGCT 

J\AGCCACTCTGAGTTGTTGTCTACCGCTCAGTCTTCGACTTCTAGTAGTTTCGTGTTTTC 

TTCGGACACGTCGGAGACTTCTAGTACGGACAAGGAAAGCAACGAAGAGACGGTGTTTGA 

TTTGCCGGACCTTTTCACGGACGGGCTTATGAACCCAAACGATGCGTTTTGTTTATGCAA 

CGGCACCTTTACGTGGCAGCTTTACGGAGAGGAGGATGTAGGGTTCAGGTTTGAAGAGCC 

GTTTAATTGGCAAAATGACTAAACCGCCCTCCACTTGCTTACTGTAATTACTAACATATA 

ATTTTCTTGATAAAGAACATATATTTCCATTACGGTATTAACTAATCTTTTCTATCCTTT 

TCTCTTTTCTTGTTTCTACATCTGAGTATATTGTCACTATGTGAAAAAATTGATCTCGTT 

TTGAATATTTACTTTTCAT^AATTGAAGTAACGCAAGTGATTGATAAAAAAAAAAAAA 

>G1381 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGKQINIESSATHHQDNIVSVITATISSSSVVTSSSDSWSTSKRSLVQDNDSGGKRRKSN

VSDDNKNPTSYRGVRMRSWGKWVSEIREPRKKSRIWLGTYPTAEMAARAHDVAALAIKGN

SGFLNFPELSGLLPRPVSCSPKDIQAAATKAAEATTWHKPVIDKKLADELSHSELLSTAQ
SSTSSSFVFSSDTSETSSTDKESNEETVFDLPDLFTDGLMNPNDAFCLCNGTFTWQLYGE
EDVGFRFEEPFNWQND* 
>G1382 (90..1763)

CTCTCATTTCGCCATAGCTGAGAGCTTCTTCTACTTTCCCTTAGCTTCTTTTTTCCTTCA 
TTTTTGTTCTACCCTTGCGAATCTCTGAAATGAACCCTCAAGCTAATGACCGGAAGGAGT 
TTCAGGGAGATTGTTCGGCGACGGGAGATCTCACGGCAAAGCACGATTCAGCTGGAGGAA 
ACGGAGGTGGAGGTGeTAGGTATAAGCTGATGTCACCGGCCAAGCTTCCGATCTCGAGGT 
CGACTGATATCACGATTCCTCCTGGGTTGAGTCCGACTTCGTTTTTGGAATCTCCTGTTT 
TCATCTCCAACATCAAGCCAGAACCTTCCCCTACTACTGGTTCTTTGTTCAAGCCTCGAC 
CAGTGCACATTTCTGCTAGCTCAAGTTCTTATACAGGCAGGGGGTTCCATCAGAACACCT 
TTACTGAGCAGAAGTCCAGTGAATTTGAGTTCAGACCTCCTGCATCAAATATGGTATATG 
CAGAGCTTGGCAAGATTAGAAGTGAGCCACCAGTACATTTTCAAGGCCAGGGCCATGGAT 
CCTCACACTCACCTTCTTCGATCAGTGATGCTGCAGGTTCCTCAAGTGAGCTAAGCCGGC 
CAACTCCTCCTTGTCAGATGACACCAACGAGCTCAGATATTCCGGCTGGATCTGATCAAG 
AGGAATCAATCCAGACTTCCCAAAATGACTCCAGAGGAAGCACTCCATCCATCTTGGCTG

ATGATGGTTATAACTGGAGAAAATATGGTCAAAAGCATGTCAAAGGGAGTGAATTTCCCC 

GGAGCTATTATAAATGTACACATCCTAATTGTGAAGTGAAAAAGTTATTTGAAAGATCTC

ATGATGGGCAGATCACCGATATTATATACAAGGGTACACATGACCATCCTAAACCTCAAC 

CTGGTCGCCGAAACTCTGGTGGTATGGCTGCACAAGAAGAAAGGCTAGACAAGTATCCTT 

CTTC^CTGGCCGAGATGAGAAGGGATCTGGCGTCTACAACTTGTCTAACCCCAATGAAC 

AAACTGGTAACCCTGAAGTACCTCCTATCTCAGCATCTGACGATGGTGGAGAAGCGGCAG 

CGTCAAATAGGAATAAAGATGAGCCGGACGATGATGATCCATTCTCAAAACGGAGGAGGA 

TGGAGGGTGCGATGGAAATAACTCCACTAGTGAAACCCATCCGGGAGCCTCGGGTTGTTG 

TTGAAACTCTGAGTGAGGTTGACATTCTGGATGATGGTTATAGATGGCGCAAATATGGGC 

AGAAAGTCGTAAGGGGGAACCCAAATCCCAGGAGCTACTACAAATGCACAGCTCATGGAT 

GCCCAGTGAGAAAACACGTGGAGAGAGCATCACATGATCCAAAAGCTGTAATAACAACAT 

ACGAAGGCAAACACGATCATGATGTTCCCACTTCAAAGTCTAGCAGCAATCACGAAATCC 

AGCCTCGGTTCAGACCAGATGAAAC^GACACCATCAGCCTC^TCTTGGTGTTGGAATCT 

CATCTGATGGACCTAACGACGCTTCCAACGAACATCAGCACCAGAATCAACAACCT 

ACCAAACTCACCCAAATGGAGTCAATTTCAGGTTTGTTCATGCTAGTCCCATGTCATCCT 

ACTATGCTAGCTTAAATAGCGGTATGAATCAGTACGGCCAGAGAGAAACAAAGAACGAGA 

CTCAAAATGGTGACATCTCGTCCTTGAACAATTG^^ 

GGAGAGTA(^TCGGGTCCGTAAAACAAAAAGTAAGCAACATTATGTACGGGATCTTCTT 

AGGTTAGGAATGGGACGAGGCCTTGTTCTATATAATTCCTATTTCTTCACAGAGAGCT 

TCTTGATTCAAACTATCTCCACCATATATATTTGTTTGTGTCACCTGTATTGAGTTCCAA 

AAATGTTATGTAAAAATACACAACAAGATGTTAATGCTTTTATTTAAACAAGAAACAGCA 

ATATTACTACAAAAAAAAAAAAAAAAAA 

>G1382 Amino Acid Sequence (domain in AA coordinates: 210-266, 385-437) 

MNPQANDRKEFQGDCSATGDLTAKHDSAGGNGGGGARYKLMSPAKLPISRSTDITIPPGL

SPTSFLESPVFISNIKPEPSPTTGSLFKPRPVHISASSSSYTGRGFHQNTFTEQKSSEFE

FRPPASNMVYAELGKIRSEPPVHFQGQGHGSSHSPSSISDAAGSSSELSRPTPPCQMTPT 

SSDIPAGSDQEESIQTSQNDSRGSTPSILADDGYNWRKYGQKHVKGSEFPRSYYKCTHPN 

CEVKKLFERSHDGQITDIIYKGTHDHPKPQPGRRNSGGMAAQEERLDKYPSSTGRDEKGS

GVYNLSNPNEQTGNPEVPPISASDDGGEAAASNRNKDEPDDDDPFSKRRRMEGAMEITPL

VKPIREPRVVVQTLSEVDILDDGYRWRKYGQKVVRGNPNPRSYYKCTAHGCPVRKHVERA

SHDPKAVITTYEGKHDHDVPTSKSSSNHEIQPRFRPDETDTISLNLGVGISSDGPNHASN 

EHQHQNQQLVNQTHPNGVNFRFVHASPMSSYYASLNSGMNQYGQRETKNETQNGDISSLN

NSSYPYPPNMGRVQSGP*

>G1435 (8..904)

GTGAAACATGGGGAAGGAAGTTATGGTGAGCGATTACGGTGACGACGACGGAGAAGACGC 
CGGCGGCGGCGATGAATATAGGATTCCGGAATGGGAAATTGGTTTACCCAACGGAGATGA 
TTTGACTCCGTTATCTCAATATCTAGTCCCGTCGATTCTCGCGTTAGCTTTCAGCATGAT 
CCCAGAACGAAGCCGTACAATTCACGACGTCAATCGCGCGTCGCAAATCACGCTCTCTTC 
GTTGAGAAGCAGTACCAATGCTTCGTCTGTGATGGAGGAGGTCGTGGATCGAGTTGAATC 
GAGTGTTCCAGGATCAGATCCGAAGAAACAGAAGAAATCGGATGGTGGTGAAGCAGCGGC 
GGTGGAGGATTCCACGGCGGAGGAAGGAGACTCCGGGCCTGAAGACGCGTCTGGGAAGAC 
ATCGAAACGACCGCGTTTAGTGTGGACACCGCAGCTACACAAGAGATTTGTGGACGTTGT 
GGCTCATCTAGGGATTAAAAACGCAGTGCCGAAGACGATTATGCAGCTGATGAACGTGGA 
AGGACTTACTCGTGAGAACGTTGCGTCTCATTTGCAGAAATATAGGCTTTACCTTAAACG 
GATTCAAGGATTGACGACGGAAGAAGATCCTTATTCGTCGTCGGATCAGCTCTTCTCTTC 
AACGCCGGTTCCTCCACAGAGCTTTCAAGACGGCGGAGGAAGTAACGGAAAGTTGGGGGT 
TCCGGTTCCGGTTG€GTCGATGGTGCCTATTCCAGGCTATGGGAATCAAATGGGTATGCA 
AGGATATTATCAACAGTATAGTAACCATGGCAATGAATCAAACCAATATATGATGCAGCA 
GAATAAGTTTGGAACAATGGTGACATATCCTTCTGTTGGTGGTGGTGACGTGAATGACAA 
GTAAATGGATCTTAAAGGTCTATAATTTGCTCTACAGAGAGATACTGGTTCTTGGCTTAT 
GGTTTATTTTCCCACTTCATGAGGTTGTTGTGACTTTTAATTCTCCATGTTTTCCACACA 
AGTCTTTATTGCCTTTGTATAGAAAATGATTTC 

GTTGGAGGATGAAGCCTTCTATGAATGATTTAGTTTCCTACTGTCTCCATTCTTTATGAG 
GTAATAAAGCCTTCTTTTGCTCATCGCTTGTAGTCTTCTTAAATTCAAGACAGCGTCACA 
TGTTTGTTCGGTTATGTTAATTGTTTCTTTCTTTGGATAATGAAGATAGCATCAGGTCTC 
ATGTCTCCTCACTTTGATAAA 
>G1435 Amino Acid Sequence (domain in AA coordinates: 146-194)
MGKEVMVSDYGDDDGEDAGGGDEYRIPEWEIGLPNGDDLTPLSQYLVPSILALAFSMIPE
RSRTIHDVNRASQITLSSLRSSTNASSVMEEVVDRVESSVPGSDPKKQKKSDGGEAAAVE

DSTAEEGDSGPEDASGKTSKRPRLVWTPQLHKRFVDVVAHLGIKNAVPKTIMQLMNVEGL
TRENVASHLQKYRLYLKRIQGLTTEEDPYSSSDQLFSSTPVPPQSFQDGGGSNGKLGVPV 
PVPSMVPIPGYGNQMGMQGYYQQYSNHGNESNQYMMQQNKFGTMVTYPSVGGGDVNDK* 
>G1537 (1..783) 

ATGGAAAACGAAGTAAACGCAGGAACAGCAAGCAGTTCAAGATGGAACCCAACGAAAGAT 

C^GATCACGCTACTGGAAAATCTTTACAAGGAAGGAATACGAACTCCGAGCGCCGATCAG 

ATTCAGCAGATCACCGGTAGGCTTCGTGCGTACGGCCATATCGAAGGTAAAAACGTCTTT 

TACTGGTTCCAGAACCATAAGGCTAGGCAACGCCAAAAGCAGAAACAGGAGCGCATGGCT 

TACTTCAATCGCCTCCTCCACAAAACCTCCCGTTTCTTCTACCCCCCTCCTTGCTCAAAC 

GTGGGTTGTGTCAGTCCGTACTATTTAC^GCAAGCAAGTGATCATCATATGAATCAACAT 

GGAAGTGTATACACAAACGATCTTCTTCACAGAAACAATGTG^^ 

TACGAGAAACGGACAGTCACACAACATCAGAAACAACTTTCAGAC^ 

GCCACAAGAATGCCAATTTCTCCGAGTTCACTCAGATTTGACAGATTTGCCCTCCGTGAT 

AACTGTTATGCCGGTGAGGACATTAACGTCAATTCCAGTGGACGGATU^ACACTCCCTCTT 

TTTCCTCTTCAGCCTTTGAATGC^AGTAATGCTGATGGTATGGGAAGTTCCAGTTTTGCC 

CTTGGTAGTGATTCTCCGGTGGATTGTTCTAGCGATGGAGCCGGCCGAGAGCAGCCGTTT 

ATTGATTTCTTTTCTGGTGGTTCTACTTCTACTCGTTTCGATAGTAATGGTAATGGGTTG 

TAA 

>G1537 Amino Acid Sequence (domain in AA coordinates: 14-74) 

MENEVNAGTASSSRWNPTKDQITLLENLYKEGIRTPSADQIQQITGRLRAYGHIEGKNVF
YWFQNHKARQRQKQKQERMAYFNRLLHKTSRFFYPPPCSNVGCVSPYYLQQASDHHMNQH
GSVYTNDLLHRNNVMIPSGGYEKRTVTQ

NCYAGEDINVNSSGRKTLPLFPLQPLNASNADGMGSSSFALGSDSPVDCSSDGAGREQPF
IDFFSGGSTSTRFDSNGNGL*

>G1545 (67..729)

CATCACCAATCTTTTGAATCTAAGAGAGAGAAGAAGAAGAAGGTCTAGAGAACGAAAAGA 

AGAAACATGAATAACCAGAATGTAGATGATCATAATCTTCTACTCATTTCTCAATTGTAC 

CCTAATGTCTATACTCCATTAGTACCACAACAAGGAGGAGAAGCAAAACCAACACGGCGG 

AGGAAAAGGAAGAGCAAGAGTGTTGTGGTGGCAGAGGAGGGTGAAAACGAAGGCAATGGG 

TGGTTTAGAAAGAGAAAATTGAGTGATGAGCAAGTAAGAATGTTGGAGATTAGCTTTGAA 

GACGATCATAAGCTTGAATCCGAGAGGAAAGATCGGCTTGCTTCTGAGTTAGGGCTTGAT 

CCTCGTCAAGTCGCCGTCTGGTTCCAAAACCGCCGTGCACGGTGGAAGAACAAACGAGTC 

GAGGATGAATACACTAAACTCAAGAATGCATACGAAACCACCGTCGTTGAGAAATGTCGT 

CTTGATTCTGAGGTTATTCACCTAAAGGAACAACTTTACGAGGCTGAAAGAGAGATCCAA 

CGGCTTGCAAAAAGAGTTGAAGGAACTTTAAGTAACAGTCCTATCTCATCCTCTGTGACC 

ATTGAAGCCAATCATACGACACCGTTTTTTGGAGATTACGACATCGGATTTGACGGTGAG 

GCTGACGAGAACTTGCTCTACTCGCCAGATTACATTGATGGATTAGACTGGATGAGCCAA 

TTTATGTAAAAAACTATAAGCTAATCTATTTTCAGTCGTAGTATAG 

>G1545 Amino Acid Sequence (domain in AA coordinates: 54-117) 

MNNQNVDDHNLLLISQLYPNVYTPLVPQQGGEAKPTRRRKRKSKSVVVAEEGENEGNGWF

RKRKLSDEQVRMLEISFEDDHKLESERKDRLASELGLDPRQVAVWFQNRRARWKNKRVED 
EYTKLKNAYETTVVEKCRLDSEVIHLKEQLYEAEREIQRLAKRVEGTLSNSPISSSVTIE 
ANHTTPFFGDYDIGFDGEADENLLYSPDYIDGLDWMSQFM* 
>G1641 (1..867) 

ATGGAGGTTATGAGACCGTCGACGTCACACGTGTCAGGTGGGAACTGGCTCATGGAGGAA 
ACTAAGAGCGGCGTCGCAGCTTCTGGTGAAGGTGCCACGTGGACGGCGGCAGAGAACAAG 
GGATTCGAGAATGCTTTGGCGGTTTACGACGACAACACTCCTGATCGGTGGCAGAAGGTG 
GCTGCGGTGATTCCGGGGAAGACAGTGAGTGACGTAATTAGACAGTATAACGATTTGGAA 
GCTGATGTCAGCAGCATCGAGGCCGGTTTAATCCCGGTCCCCGGTTACATCACCTCGCCG 
CCTTTCACTCTAGATTGGGCCGGCGGCGGTGGCGGATGTAACGGGTTTAAACCGGGTCAT 
CAGGTTTGTAATAAACGGTCGCAGGCCGGTAGATCGCCGGAGCTGGAGCGGAAGAAAGGC 
GTTCCTTGGACGGAGGAAGAACACAAGCTATTTCTAATGGGTTTGAAGAAATATGGGA7VA 
GGAGATTGGAGAAACATATCTCGGAACTTTGTGATAACGCGAACGCCAACACAAGTAGCT 
AGCCACGCCCAAAAGTACTTCATCCGGCAACTTTCCGGCGGCAAGGACAAGAGACGAGCA 
AGCATTCACGACATAACCACCGTAAATCTCGAAGAGGAGGCTTCTTTGGAGACCAATAAG
AGCTCCATTGTTGTTGGAGATCAGCGTTCAAGGCTAACCGCGTTTCCTTGGAACCAAACG 
GACAACAATGGAACACAGGCAGACGCTTTCAATATAACGATTGGAAACGCTATTAGTGGC 
GTTCATTCATACGGCCAGGTTATGATTGGAGGGTATAACAATGCAGATTCTTGCTATGAC 
GCCCAAAACACAATGTTTCAACTATAG 

>G1641 Amino Acid Sequence (domain in AA coordinates: 139-200) 
MEVMRPSTSHVSGGNWLMEETKSGVAASGEGATWTAAENKGFENALAVYDDNTPDRWQKV

AAVIPGKTVSDVIRQYNDLEADVSSIEAGLIPVPGYITSPPFTLDWAGGGGGCNGFKPGH
QVCNKRSQAGRSPELERKKGVPWTEEEHKLFLMGLKKYGKGDWRNISRNFVITRTPTQVA
SHAQKYFIRQLSGGKDKRRASIHDITTVNLEEEASLETNKSSIVVGDQRSRLTAFPWNQT
DNNGTQADAFNITIGNAISGVHSYGQVMIGGYNNADSCYDAQNTMFQL*
>G165 (19..699)

CTTCAAAAC^TCTAAAAAATGGTGAAAAAAACTCTTGGTCGTAGAAAGGTAGAGATAGTG 

AAAATGACTAAGGAATCAAACCTTCAAGTCACATTTTCCAAGAGAAAAGCTGGTCTTTTT 

AAGAAGGCTAGTGAATTTTGCACATTATGTGATGCAAAAATTGCGATGATCGTGTTTTCA 

CCAGCTGGAAAAGTATTTTCTTTTGGTCATCCAAATGTTGATGTTCTGCTTGACC 

CGAGGGTGTGTTGTAGGACACAACAACACAAACCTTGATGAAAGCTACACAAAGCTTCAT 

GTTCAAATGCTCAACAAATCCTAC^CTGAGGTGAAGGCGGAAGTAGAAAAAGAACAAAA 

AATAAGCAGTCGCGGGCTCAAAATGAAAGAGAAAACGAAAACGCTGAGGAGTGGTGGAGT 

AAGTCTCCATTAGAACTCAACTTAAGTCAATCAACCTGTATGATACGTGTTCTTAAAGAT 

TTGAAGAAGATAGTTGATGAAAAAGCAATTCAATTAATCCATCAAACAAACCCAAACTTC 

TATGTTGGAAGTTCTAGCAATGCTGCTGCTCCAGCAACTGTTAGTGGTGGTAATATCTCC 

ACAAACCAGGGGTTCTTTGATCAAAACGGAATGACGACTAATCCTACTCAAACACTTCTG 

TTTGGATTTGATATTATGAATCGCACACCAGGAGTTTAAATAAGTCTATCCTCATTATGG 

GTCTTGGTACTATAAGTTCATCTCTCTCGTTGTTGACTTTTTAAGTCTCCAATAGTTTGT 

TGTG 

>G165 Amino Acid Sequence (conserved domain in AA coordinates: 7-62)

MVKKTLGRRKVEIVKMTKESNLQVTFSKRKAGLFKKASEFCTLCDAKIAMIVFSPAGKVF

SFGHPNVDVLLDHFRGCWGHNNTNM^ 

QNERENENAEEWWSKSPLELNLSQSTCMIRVLKDLKKIVDEKAIQLIHQTNPNFYVGSSS
NAAAPATVSGGNISTNQGFFDQNGMTTNPTQTLLFGFDIMNRTPGV*
>G1652 (77..1078)

AGCAAGTCCAAATCTCCCTCTCTCTCTCTCTATCTATCTCTCTATAGAAGATTTTTTAAC 

TAAGAAGCTAGCGATCATGGCCACAGCGATGAACGTTTTCTCTACCAAATGGTCCTCCGA 

ATTGGATATAGAAGAATATAGTATCATCCACCAATTCCACATGAACTCACTCGTCGGAGA

TGTTCCACAGTCTCTCTCATCTCTTGATGATACCACCACTTGTTATAACCTTGATGCTTC 

TTGTAATAAAAGTTTGGTAGAAGAAAGACCTTCAAAGATCCTCAAGACCACTCACATATC 

ACCAAACTTACATCCTTTTTCTTCTTCTAATCCTCCTCCTCCAAAGCACCAGCCCTCTTC 

TAGGATTCTTTCTTTTGAAAAGACAGGTTTACATGTTATGAATGACAACTCTCCAAACTT

AATATTTAGCCCCAAGGACGAAGAAATTGGATTACCAGAGCATAAGAAAGCCGAGCTGAT 

AATAAGAGGGACAAAGAGAGCTCAATCCTTGACTCGAAGCCAATCAAATGCTCAAGATCA 

CATACTGGCAGAGAGAAAACGGAGAGAGAAGCTTACTCAAAGATTTGTAGCTCTTTCCGC 

GCTAATTCCTGGCCTAAAGAAGATGGACAAGGCTTCTGTGTTGGGAGATGCAATAAAGCA 

TATAAAGTACCTCCAAGAGAGTGTGAAAGAGTATGAGGAACAAAAGAAGGAAAAGACAAT 

GGAATCAGTGGTTCTTGTAAAGAAGTCTAGTCTGGTTTTAGATGAAAATCATCAACCATC 

ATCATCATCTTCCTCAGATGGAAATCGCAATAGCTCGAGCTCAAATCTTCCAGAAATAGA 

AGTTAGGGTTTCAGGAAAAGATGTTCTTATTAAGATCCTATGCGAGAAGCAAAAGGGTAA 

TGTGATCAAGATTATGGGGGAGATTGAAAAGCTTGGTTTGTCTATCACCAACAGCAATGT

CTTGCCCTTTGGACCCACTTTTGACATCTCTATTATCGCTCAGAAGAATAACAATTTTGA 

TATGAAAATCGAGGATGTTGTGAAGAACTTGAGTTTTGGCTTATCAAAGCTCACTTAATT 

GGTTTCACGTTACATACATATACACATTCATCATCGATTTCTCCGATCGAAGAATCCAAA 

ATCAGTTTTTCCATGAAAGTGGTTTTTTAGTTGTTAAGTTTGTTGTATGGAGATTCTTAA 

GTCATTTAAAGATCCTTGTTCTTGTGTTGTTAAGTGTGCTTTAAGATGCATATCATCAAA 

TGTTTAGTAATTATTTCTCTCCAGTTTCATTTGGGACGGAATTTTTTTCGCAGTTGTTGG 

ATATATATTTCCTGCGATGTAAAGCATTTCGTTAGTTTAATAAACGTCCGATATGTTTCT 

TTGAAAA 

>G1652 Amino Acid Sequence (domain in AA coordinates: 143-215)
MATAMNVFSTKWSSELDIEEYSIIHQFHMNSLVGDVPQSLSSLDDTTTCYNLDASCNKSL

VEERPSKILKTTHISPNLHPFSSSNPPPPKHQPSSRILSFEKTGLHVMNHNSPNLIFSPK 
DEEIGLPEHKKAELIIRGTKRAQSLTRSQSNAQDHILAERKRREKLTQRFVALSALIPGL 
KKMDKASVLGDAIKHIKYLQESVKEYEEQKKEKTMESVVLVKKSSLVLDENHQPSSSSSS

DGNRNSSSSNLPEIEVRVSGKDVLIKILCEKQKGNVIKIMGEIEKLGLSITNSNVLPFGP 
TFDISIIAQKNNNFDMKIEDVVKNLSFGLSKLT*
>G1655 (132..755)

TTTCTAACTAGTCACATTGAGAGAGAGAGAGAGAGAAAGAGAGACTCTCAGAATCTGAAG 
AAGAAGAAGAGATTGTTGTTTTTGCCTTTTATCATCGGTTTCTTTGAATCTCTGGTTTTA 
AATCGGATTTAATGGTGGAGTCTCTGTTCCCGAGCATCGAAAACACAGGTGAATCGTCTC 
GAAGAAAGAAGCCGAGGATATCAGAGACGGCGGAGGCGGAGATAGAGGCACGACGTGTCA 
ACGAAGAAAGCTTGAAGAGATGGAAAACGAATCGTGTGCAACAGATCTACGCTTCTAAGC 
TCGTCGAAGCTTTACGCCGAGTTCGTC^G&G 

ATAAACTCGTCTCCGGCGCGGCGAGGGAGATACGTGATACGGCX3GATCGAGTTCTAGCTG 
CGTCCGCTCGTGGTACGACTCGGTGGAGCAGAGCGATTTTAGCGAGTCGCGTCCGAGCGA 
AGCTGAAGAAACATAGAAAGGCGAAAAAGTCAACGGGAAATTGTAAATCGAGAAAAGGTC 
TCACGGAGACGAATCGGATTAAGTTACCGGCGGTTGAGAGAAAACTGAAGATTCTTGGCC 
GTTTGGTTCCTGGTTGCCGGAAAGTCTCTC^ 

ACATCGCAGCGTTAGAGATGCAGGTTCGAGCCATGGAGGCTCTCGCCGAACTTTTAACCG 
CAGCCGCACCACGGACGACGTTGACCGGAACTTAACGGCGGCAGTTAGTTTGTCAGTTGT 
TAATTAGCTTTTCTTTTACCTTTTTAC^ 
TCGTCGACGCGATTTTAATTTATTAAATTCA 

>G1655 Amino Acid Sequence (domain in AA coordinates: 134-192) 
MVESLFPSIENTGESSRRKKPRISETAEAEIEARRVNEESLKRWKTNRVQQIYASKLVEA

LRRVRQRSSTTSNNETDKLVSGAAREIRDTADRVLAASARGTTRWSRAILASRVRAKLKK
HRKAKKSTGNCKSRKGLTETNRIKLPAVERKLKILGRLVPGCRKVSVPNLLDEATDYIAA 
LEMQVRAMEALAELLTAAAPRTTLTGT* 
>G1671 (188..751)

TCCCACTATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGG 

ACACGCTGACAAGCTGACTCTAGCAGATCTGGTACCGTCGACCCTCTCTATATAATCTTC 

TTCTACACACACACACACACGCAACCATATACGTACATGTGAAGTAGTGAGATCAATA 

GTTAGCAATGAATCTACCACCGGGATTTAGGTTTTTTCCGACCGATGAAGAGCTCGTCGT 

TCACTTCCTCCACCGGAAAGCTTCCCTCTTGCCTTGTCACCCTGATGTCATCCCCGACCT 

TGATCTTTACCATTACGATCCTTGGGACCTTCCCGGGAAAGCTTTGGGAGAAGGGAGGCA 

ATGGTACTTCTATAGTAGAAAGACACAAGAGAGAGTGACAAGCAATGGGTATTGGGGATC 

AATGGGAATGGACGAGCGAATCTACACAAGCTCCACACACAAGAAAGTGGGAATCAAAAA 

GTATCTAACTTTCTATCTCGGAGATTCTCAGACTAATTGGATCATGCAAGAATATTCCCT 

TAGTCACAAACCCGATTATAGCAAGTGGGTGATATGCAGAGTGTATGAGCAAAATTGCAG 
TGAGGAGGAAGACGATGATGGGAC^GAACTCTCATGTTTGGATGAAGTGTTTTTGTCTTT 
AGATGATCTTGACGAAGTAAGCTTACCGTAATAAAGACAGAAGCACCCAAGAAGAGAAAA 
AAAAAAAAAGGGTTTAGTGGGCAATTATTTCTAAGCGACCGCTCTAGACAGGCCTAGTAC 
CGGATCCTCTAGCTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCAAGT 

>G1671 Amino Acid Sequence (domain in AA coordinates: TBD) 
MNLPPGFRFFPTDEELVVHFLHRKASLLPCHPDVIPDLDLYHYDPWDLPGKALGEGRQWY 
FYSRKTQERVTSNGYWGSMG^EPIYTSSTHKKVGIKKYLTFYCGDSQTNWIMQEYSLPD 
SSSSSSRSSKRSSRASSSSHKPDYSKWVICRVYEQNCSEEEDDDGTELSCLDEVFLSLDD 

LDEVSLP*
>G1756 (71..1003)

ATATGTACTTGTACAeCAACCCACCAAAAGAGATAAAAGAGGAAACAAAAACTCGAAAAG 
AGAGAGATATATGGGTGAGGTGGCTTATATGGACGAAGGAGACCTAGAAGCAATAGTCAG 
AGGCTACTCCGGCTCCGGAGACGCGTTTTCCGGCGAAAGTTCCGGTACGTTTTCACCTTC 
GTTTTGCCTACCGATGGAGACGTCTAGTTTCTACGAACCGGAGATGGAGACAAGTGGCTT 
AGATGAGCTCGGTGAACTTTACAAACCCTTTTACCCTTTCTCCACACAAACGATCCTCAC 
AAGCTCGGTCTCTCTCCCTGAAGATTCAAAACCTTTCCGAGATGACAAGAAACAACGATC 
ACATGGTTGTCTTTTATCCAACGGATCAAGAGCTGATCATATCCGAATTTCAGAATCCAA 
ATCAAAGAAAAGCAAGAAGAATCAACAGAAGAGAGTTGTTGAGCAAGTGAAAGAAGAGAA 
TCTGTTGTCGGACGCATGGGCGTGGCGTAAATACGGGCAGAAACCCATCAAAGGATCTCC

ATACCCAAGGAGTTATTACAGATGCAGTAGCTCTUU^GGGTGTTTGGCAAGAAAACAAGT 

CGAAAGAAATCCTCAAAACCCGGAGAAATTCACCATAACATACACTAATGAGCACAATCA 

TGAACTACCAACCCGGAGAAACTCATTAGCCGGTTCGACTCGAGCAAAAACTTCCCAACC 

CAAACCAACCTTAACCAAAAAATCCGAAAAAGAAGTTGTTTCTTCCCCTACAAGTAATCC 

TATGATCCCATCCGCTGATGAATCTTCTGTTGCGGTTCAAGAAATGAGCGTTGCGGAAAC 

GAGTACGCACCAAGCGGCTGGAGCAATCGAGGGCCGCCGCTTGAGTAACGGTTTACCATC

GGATTTGATGTCCGGGAGCGGAACTTTTCCAAGTTTTACCGGTGACTTCGATGAACT 

GAATAGCCAAGAGTTCTTCAGTGGGTATTTATGGAATTACTAGAGAGCATTAGGTGTATG 

TATATATATAT 

>G1756 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGEVAYMDEGDLEAIVRGYSGSGDAFSGESSGTFSPSFCLPMETSSFYEPEMETSGLDEL

GELYKPFYPFSTQTILTSSVSLPEDSKPFRDDKKQRSHGCLLSNGSRADHIRISESKSKK 

SKKNQQKRVVEQVKEENLLSDAWAWRKYGQKPIKGSPYPRSYYRCSSSKGCLARKQVERN
PQNPEKFTITYTNEHNHELPTRRNSLAGSTRAKTSQPKPTLTKKSEKEVVSSPTSNPMIP
SADESSVAVQEMSVAETSTHQAAGAIEGRRLSNGLPSDLMSGSGTFPSFTGDFDELLNSQ 

EFFSGYLWNY* 
>G1757 (250..1224)

ATCACC^TCCTATAACACTCTOVTTCTCATCATATCATTCTTCAATCTATATAACCCAT 

TCTTAATTATACTCAACACACATTATATTTTTCTGATCATATCATTCTT^ 

ATATAACG^TTCTTGATTTATACTTAAAACACACATTATACATCTTTCTCATCATAGTT 

TGTATCAATTTCCTAGAGTAAACTACCTAAAGGAAAAAAAAAATCTATTTTGGGAATCAT 

ATACTAAAAATGGAAGGAAGAGATATGTTAAGTTGGGAGCAAAAGACATTGCTAAGCGAG 

CTTATCAATGGATTTGATGCGGCCAAAAAGCTTCAGGCACGACTTAGAGAAGCTCCGTCG 

CCGTCGTCATCATTTTCATCACCGGCGACGGCTGTTGCTGAGACTAACGAGATTCTGGTG 

AAGCAGATAGTTTCTTCCTACGAGAGATCTCTTCTTCTGCTAAACTGGTCATCCTCACCG 

AGCGTACAACTTATTCCGACGCCGGTTACTGTAGTCCCGGTGGCAAATCCCGGCAGTGTT 

CCAGAATCTCCGGCATCGATAAACGGAAGTCCGAGAAGTGAAGAGTTTGCCGATGGAGGA 

GGTTCTAGCGAGAGTCATCATCGCCAAGATTACATTTTCAATTCAAAGAAAAGAAAGATG 

TTACCAAAGTGGTCAGAAAAAGTGAGAATAAGCCGAQAGAGAGGCTTAGAAGGACCTCAA 

GATGATGTCTTTAGCTGGAGAAAATATGGTCAAAAAGACATTTTAGGCGCCAAATTCCCA 

AGGAGTTATTACAGATGCACACATCGTAGCACACAAAACTGTTGGGCAACGAAACAAGTC 

CAGAGATCAGACGGGGATGCTACGGTTTTCGAAGTGACGTACAGAGGAACACACACTTGT 

TCGCAGGCGATCACAAGAACACCACCATTAGCCTCGCCGGAGAAGCGACAAGACACCAGA 

GTCAAACCAGCCATTACCCAAAAGCCAAAGGATATTCTCGAGAGTCTTAAATCCAACTTA 

ACCGTTCGAACCGATGGGCTTGATGATGGTAAAGACGTTTTCTCGTTCCCTGATACGCCG 

CCGTTTTACAATTACGGAACTATCAACGGCGAGTTCGGCCACGTGGAGAGTTCTCCGATC

TTCGACGTTGTTGACTGGTTCAATCCAACGGTCGAGATTGACACAACTTTCCCCGCGTTT 

TTACACGAGTCGATTTATTATTAATTAAAATTTGTAACAGAGAAATAGATAGTAACTAGT 

AAGTAATGATCAGCGAGAGTTAAAACATAAAAGTACTTAGAGTAATCTAACGATGCATAA 

TAAGGAATGTTCAACAGGACTTGAACATGATTTCAATACTAAGAGAGATTTATCTAGCTA 

CTGGTAGTAGCCGCAGACTTCTTGTTGTAGCTTCACTTNCTTTTTGTTGCTT 

>G1757 Amino Acid Sequence (domain in AA coordinates: 158-218) 

MEGRDMLSWEQKTLLSELINGFDAAKKLQARLREAPSPSSSFSSPATAVAETNEILVKQI 

VSSYERSLLLLNWSSSPSVQLIPTPVTVVPVANPGSVPESPASINGSPRSEEFADGGGSS 

ESHHRQDYIFNSKKRKMLPKWSEKVRISPERGLEGPQDDVFSWRKYGQKDILGAKFPRSY

YRCTHRSTQNCWATKQVQRSDGDATVFEVTYRGTHTCSQAITRTPPLASPEKRQDTRVKP 

AITQKPKDILESLKSNLTVRTDGLDDGKDVFSFPDTPPFYNYGTINGEFGHVESSPIFDV

VDWFNPTVEIDTTFPAFLHESIYY*

>G1782 (1..927)

ATGCAAGTGTTTCAAAGGAAAGAAGATTCATCTTGGGGAAACTCAATGCCTACAACAAAT 
TCAAATATTCAAGGATCTGAATCTTTCAGCTTGACTAAGGATATGATAATGTCTACAACA
CAATTACCCGCGATGAAACATTCGGGTTTGCAGCTGCAAAATCAAGATTCAACCTCATCA 
CAATCTACTGAAGAAGAATCAGGCGGCGGTGAAGTTGCAAGCTTTGGAGAATATAAGCGT 
TATGGATGCAGCATTGTTAATAACAATCTCTCAGGTTACATCGAAAACTTGGGAAAGCCT 
ATTGAAAATTATACTAAGTCAATTACTACCTCGTCGATGGTGTCTCAAGACTCTGTGTTT 
CCTGCTCCTACTTCTGGTCAAATATCTTGGTCTCTTCAATGTGCTGAAACGTCACATTTC 
AATGGTTTCTTGGCTCCTGAATATGCATCAACACCAACGGCGCTGCCACATTTAGAGATG

ATGGGTTTGGTTTCTTCAAGAGTGCCATTGCCTCATCACATTCAAGAGAATGAACCAATA 

TTTGTCAATGCGAAACAGTATCATGCGATTCTCCGTCGCAGGAAGCACCGTGCTAAACTC 

GAAGCTCAGAACAAACTCATCAAATGCCGTAAACCGTACCTTCATGAGTCTCGCCATCTT 

CATGCTTTAAAGAGAGCTAGAGGCTCCGGTGGACGTTTCCTCAATACAAAGAAGCTTCAA

GAATCATCAAACTCACTGTGTTCTTCTCAAATGGCAAATGGACAAAATTTCTCTATGAGC 

CCTCACGGTGGTGGAAGCGGAATCGGGTCTAGTTCGATCTCACCGAGCTCCAATTCAAAC 

TGTATCAACATGTTCCAAAACCCGCAGTTCAGATTCTCAGGTTATCCGTCAA 

GCCTCAGCTCTCATGTCAGGGACTTGA 

>G1782 Amino Acid Sequence (domain in AA coordinates: 166-238) 
MQVFQRKEDSSWGNSMPTTNSNIQGSESFSLTKDMIMSTTQLPAMKHSGLQLQNQDSTSS
QSTEEESGGGEVASFGEYKRYGCSIVNNNLSGYIENLGKPIENYTKSITTSSMVSQDSVF
PAPTSGQISWSLQCAETSHFNGFLAPEYASTPTALPHLEMMGLVSSRVPLPHHIQENEPI 

FVNAKQYHAILRRRKHRAKLEAQNKLIKCRKPYLHESRHLHALKRARGSGGRFLNTKKLQ
ESSNSLCSSQMANGQNFSMSPHGGGSGIGSSSISPSSNSNCINMFQNPQFRFSGYPSTHH 

ASALMSGT* 

>G184 (327..1937)

TGAATTCTAGCCTTTTTGTAGGCGAATCATCTGGACCGGTAAGAGACTCTCTCATCGATA 

ATAACCACATAATTTAATCAAACTCTTTCTCTCTCTTTCTAAGATCTTTTGCTTTGCTC 

TTTCCTTTTTGATCTTCCTATATATGGAGAAGCACCAAAACGGTACTTACTATACGATAC 

TGTACGGATCCATCAAACTGGATTAATTATCAAAACGTACATTTTTATCTTACCTGGCAA 

GTTACATTCCTAGGGTTTTGGAGAATCCAATCAACAACAAAGAAAATAATCATCGTTACA 

ATAATCAGTATCACGCACAGACTTAGATGTTCCGGTTTCCAGTGAGTCTAGGCGGTTCAC 

GTGACGAAGACCGTCACGATCAGATCACACCGTTGGATGACCATCGTGTGGTGGTTGATG 

AGGTTGACTTCTTCTCAGAGAAGAGAGATAGGGTTTCACGTGAGAACATCAACGACGACG 

ACGACGAAGGCAATAAGGTTCTCATCAAAATGGAGGGTTCACGAGTTGAAGAAAACGATC 

GTTCCAGAGATGTCAATATCGGTCTGAATCTTCTGACCGCGAATACGGGAAGCGATGAGT 

CAACGGTGGATGATGGACTATCAATGGATATGGAAGATAAACGTGCAAAGATTGAGAACG 

CACAACTACAAGAAGAGCTCAAGAAGATGAAAATAGAGAATCAAAGGCTAAGAGATATGT 

TGAGCCAAGCGACGACCAACTTCAATGCCTTACAAATGCAACTTGTTGCCGTCATGAGGC 

AACAAGAACAACGTAACTCTTCACAAGATCATCTCCTGGAGAGCAAAGCAGAAGGAAGGA 

AACGGCAGGAACTGCAAATCATGGTGCCAAGGCAGTTCATGGACCTTGGGCCGTCGTCTG 

GAGCAGCAGAGCATGGAGCCGAAGTGTCATCTGAAGAGAGGACAACGGTTCGTTCAGGTT 

CTCCTCCTTCGCTTCTAGAAAGTTCCAATCCCCGAGAGAACGGAAAGAGGTTGCTTGGAA 

GAGAAGAAAGCTCAGAGGAATCAGAGTCTAACGCCTGGGGAAACCCTAACAAAGTCCCCA 

AACATAATCCATCCTCTAGCAATAGCAATGGAAACAGAAACGGAAATGTTATTGATCAGT 

CGGCCGCAGAAGCCACCATGCGGAAAGCCCGTGTCTCAGTTCGTGCCCGATCTGAAGCTG 

CCATGATAAGCGATGGATGTCAATGGAGAAAGTACGGACAAAAAATGGCTAAAGGAAACC 

CGTGTCCGCGGGCTTATTATCGTTGCACAATGGCCGGTGGATGTCCAGTTCGCAAGCAAG 

TGCAGCGTTGCGCAGAAGACAGATCTATTCTCATAACCACCTACGAAGGAAACCACAACC 

ATCCACTCCCACCAGCCGCTACGGCCATGGCCTCAACAACCACCGCAGCTGCAAGCATGC 

TCCTCTCGGGCTCAATGTCGAGTCAAGACGGTTTAATGAACCCAACAAACCTCCTAGCTC 

GAGCTATCTTGCCTTGCTCCTCAAGCATGGCTACAATCTCAGCCTCCGCACCATTCCCAA 

CCATCACATTGGACCTCACCAATTCACCCAACGGTAACAACCCTAATATGACCACTAATA 

ACCCGTTGATGCAGTTCGCTCAACGGCCCGGTTTCAACCCGGCAGTTTTGCCTCAAGTGG 

TTGGTCAAGCTATGTACAATAACCAACAACAGTCCAAGTTTTCTGGTTTACAGTTACCGG 

CTCAGCCACTGCAGATCGCGGCCACTTCCTCGGTGGCCGAGAGCGTTAGTGCTGCCAGTG 

CAGCAATTGCGTCCGATCCAAACTTTGCGGCGGCTCTAGCGGCAGCGATCACGTCCATTA 

TGAACGGTTCCAGTCATCAAAATAATAACACCAATAATAATAATGTGGCTACGAGCAACA 

ATGACAGTAGGC^UVTAAGAGTTTTCATTTTGATGGTCGATTTTTTyTTTTGGGG 

>G184 Amino Acid Sequence (domain in AA coordinates: 295-352) 

MFRFPVSLGGSRDEDRHDQITPLDDHRVVVDEVDFFSEKRDRVSRENINDDDDEGNKVLI

KMEGSRVEENDRSRDVNIGLNLLTANTGSDESTVDDGLSMDMEDKRAKIENAQLQEELKK

MKIENQRLRDMLSQATTNFNALQMQLVAVMRQQEQRNSSQDHLLESKAEGRKRQELQIMV

PRQFMDLGPSSGAAEHGAEVSSEERTTVRSGSPPSLLESSNPRENGKRLLGREESSEESE 

SNAWGNPNKVPKHNPSSSNSNGNRNGNVIDQSAAEATMRKARVSVRARSEAAMISDGCQW

RKYGQKMAKGNPCPRAYYRCTMAGGCPVRKQVQRCAEDRSILITTYEGNHNHPLPPAATA
MASTTTAAASMLLSGSMSSQDGLMNPTNLLARAILPCSSSMATISASAPFPTITLDLTNS
PNGNNPNMTTNNPLMQFAQRPGFNPAVLPQVVGQAMYNNQQQSKFSGLQLPAQPLQIAAT
SSVAESVSAASAAIASDPNFAAALAAAITSIMNGSSHQNNNTNNNNVATSNNDSRQ*
>G1845 (111..989)

AAGACATAATTTTCTCTGTTTTCCTAGCTCTC 
TTTTGGCAAATCGTGAACTGCGACGTCTTTAAGG^ 

ACGAGGAGCTAAATCTTTGTATTACGAAAGGTAAAAATGTTGATCATTCTTTTGGAGGAG 
AAGCTTCTTCCACGTCCCCAAGATCTATGAAGAAAATGAAGAGTCCTAGTCGTCCTAAAC 
CCTATTTCC^TCCTCTTCTTCTCCTC^ 

CAACACTTCAGAATCAGCAACAACAACTCGGATCATACGTTCCGGTACTTGAGCAACGAC 
AAGACCCGAC^TGC^GGCCAGAAGCAAATGATCTCCTTTAGTCCTCAACAAO^CAAC 
AGCAGCAGCAGTATATGGCCCAGTACTGGAGTGACACATTGAATCTGAGTCCAAGAGGAA 
GAATGATGATGATGATGAGCCAAGAAGCTGTTCAACCTTACATCGCAACGAAGCTGTACA 
GAGGAGTGAGACAACGTCAATGGGGAAAATGGGTCGCAGAGATCCGTAAGCCACGAAGCA 
GGGCACGTCTTTGGCTTGGTACCTTTGATACAGCTGAAGAAGCTGCCATGGCCTACGACC 
GCCAAGCCTTCAAATTACGAGGCCAGAGCGCAACACTGAATTTC^ 

ATAAGGAAAGCGAGCTGGATGATTCAAACTCGTCGGATCAGAAAGAACCTGAAACGCCAC 
AGCCAAGCGAGGTTAACTTGGAGAGCAAGGAACTACCGGTGATTGATGTTGGGAGAGAGG 
AAGGTATGGCTGAGGCATGGTACAATGCCATTACATCGGGATGGGGTCCTGAAAGTCCTC 
TTTGGGATGATTTGGATAGTTCTCATCAGTTTTCATCAGAAAGCTCATCTTCTTCTCCTC 
TCTCTTGTCCTATGAGGCCTTTCTTTTGAAAAAGTTTATAAACCCACATTGTGTTGTAGG 
TTATAGTTTAGGGTTATGCTCATTGGCATTTGGATGGAGGCAATTTTTGTGATCTCCCAT 
TCCACCACATATCAGTCATTATATGTGTCTACCTTTTCTCTGTATTTCTATCATTATCAT 
TGTTTTTATTATGTGTCTGTATGTGTTTCCCTATTGCTACATACATAGATGTCCTCTTTG 
TTCAAAAAAAAAAAAAAAAAAAAAAA 

>G1845 Amino Acid Sequence (domain in AA coordinates: 140-207) 

MDFDEELNLCITKGKNVDHSFGGEASSTSPRSMKKMKSPSRPKPYFQSSSSPYSLEAFPF 

SLDPTLQNQQQQLGSYVPVLEQRQDPTMQGQKQ 

PRGRMMMMMSQEAVQPYIATKLYRGVRQRQWGKWVAEIRKPRSRARLWLGTFDTAEEAAM
AYDRQAFKLRGHSATLNFPEHFVNKESELHDSNSSDQKEPETPQPSEVNLESKE 
GREEGMAEAWYNAITSGWGPESPLWDDLDSSHQFSSESSSSSPLSCPMRPFF* 
>G1879 (3..917)

AAATGCCCTTAGAGGCTGTCGTATACCCGCAAGATCCATTCGGATATCTCTCCAATTGCA 

AAGATTTTATGTTCCACGACTTATACTCTCAAGAAGAGTTCGTAGCTCAAGATACGAAGA 

ACAACATTGATAAGTTAGGGCATGAACAGAGCTTTGTGGAACAAGGTAAGGAGGACGATC 

ATCAATGGCGAGACTATCATCAGTATCCTTTGTTGATCCCTTCGTTGGGAGAAGAGCTTG

GTCTTACCGCCATTGATGTGGAGAGTCATCCTCCTCCACAGCACCGGAGGAAGAGGAGGA 

GAACGAGAAACTGCAAGAACAAGGAAGAGATCGAGAACCAGAGAATGACTCACATCGCCG 

TCGAGAGAAATCGCCGGAAACAGATGAACGAGTATCTGGCTGTGCTCCGTTCTCTAATGC 

CGTCGTCGTATGCTCAAAGAGGAGATCAAGCGTCGATAGTAGGAGGAGCTATAAACTACG 

TGAAGGAGTTAGAGCATATTTTACAATCTATGGAGCCGAAGAGAACTAGGACTCATGATC 

CCAAAGGAGACAAGACTAGCACTAGCTCGTTAGTGGGTCCATTCACAGATTTTTTCAGCT 

TCCCACAATATTCTACAAAGTCATCATCAGATGTACCGGAAAGCTCATCTTCACCGGCGG 

AGATAGAGGTTACGGTGGCAGAAAGCCATGCGAACATCAAGATAATGACGAAGAAGAAAC 

CGAGGCAGCTTCTTAAGCTCATAACTTCTTTACAAAGCCTAAGGCTCACTCTTCTTCATC 

TCAATGTCACCACTCTCCACAACTCCATTCTCTACTCCATCAGCGTCAGGGTTGAAGAAG 

GAAGCCAACTGAATACCGTGGACGACATTGCAACAGCTTTGAATCAAACCATAAGGAGGA 

TTCAAGAAGAGACATAAATTCAGCAAATAGATTATAATTAACTTGTTTTATTTTTATTTTA

TTTTGAAATAACTGAAATCAGTTTTCTAATTTTTTTTTTTTTTCACTATTCCTCTAATCC 

TCCCTATGTAAGTTGCATTTTTGTCTCTTGTAATGAATCAATGGTCATAAAG 

AAAAAAATTGAATAAAAGAAAATGGTT 

>G1879 Amino Acid Sequence (domain in AA coordinates: 107-176) 

MPLEAVVYPQDPFGYLSNCKDFMFHDLYSQEEFVAQDTKNNIDKLGHEQSFVEQGKEDDH 

QWRDYHQYPLLIPSLGEELGLTAIDVESHPPPQHRRKRRRTRNCKNKEEIENQRMTHIAV 

ERNRRKQMNEYLAVLRSLMPSSYAQRGDQASIVGGAINYVKELEHILQSMEPKRTRTHDP

KGDKTSTSSLVGPFTDFFSFPQYSTKSSSDVPESSSSPAEIEVTVAESHANIKIMTKKKP 

RQLLKLITSLQSLRLTLLHLNVTTLHNSILYSISVRVEEGSQLNTVDDIATALNQTIRRI
QEET*

>G1888 (1..729) 

ATGAAGATTTGGTGTGCTGTTTGTGATAAAGAAGAAGCTTCGGTGTTTTGTTGTGCGGAT 
GAAGCAGCTCTTTGTAATGGTTGCGATCGCCATGTTCATTTCGCCAATAAACTAGCCGGG 
AAACATCTCCGGTTCTCTCTGACTTCTC 

TGCGGGGAGAGGCGTGCATTATTATTTTGCCAAGAAGACAGAGCAATACTATGCAGAGAA 

TGTGACATTCCAATACATCAAGCTAATCAGCACACTAAGAAACACAATAGATT 

ACCGGCGTTAAGATCTCTGCCTCCCCGTCAGCCTACCC^GAGCCTCCAATTCCAACTCT 

GCTGCTGCATTTGGTCGAGCCAAAACCCGACCAAAATCAGTATCGAGCGAGGTCCCGAGC 

TCGGCCTCCAATGAGGTATTTACGAGCTCTTCTTCGACGACCACGAGCAATTGCTATTAT 

GGGATAGAAGAAAACTACCATCACGTGAGCGATTCGGGGTCGGGATCGGGTTGTACAGGT 

AGTATATCCGAGTATTTGATGGAGACATTACCGGGTTGGAGAGTGGAGGATTTGCTTGAA 

CACCCTTCTTGTGTCTCCTATGAGGATAACATTATTACTAATAACAATAACAGTGAGTCT

TATAGGGTTTATGATGGTTCTTCACAATTCCATCATCAAGGGTTTTC 

TTCTCTTGA 

>G1888 Amino Acid Sequence (domain in AA coordinates: 5-50)
MKIWCAVCDKEEASVFCCADEAALCNGCDRHVHFANKLAGKHL

CGERRALLFCQEDRAILCRECDIPIHQANEHTKKHNRFLLTGVKISASPSAYPRASNSNS 
AAAFGRAKTRPKSVSSEVPSSASNEVFTSSSSTTTSNCYYGIEENYHHVSDSGSGSGCTG 
SISEYLMETLPGWRVEDLLEHPSCVSYEDNIITNNNNSESYRVYDGSSQFHHQGFWDHKP
FS* 

>G189 (34..987)

CCACAACTCTCTCCTTGTAGAGAGAGAGATTTTATGGCGGTGGAGCTCATGACTCGGAAT 
TACATCTCCGGCGTCGGAGCTGATAGCTTCGCCGTTCAAGAAGCAGCTGCTTCAGGACTC 
AAAAGTATCGAAAATTTCATCGGTTTAATGTCTCGTGATAGCTTTAACTCTGATCAGCCA 
TCTTCTTCTTCCGCCTCCGCCTCCGCCTCCGCCGCCGCAGATCTTGAATCAGCTCGTAAC 
ACAACGGCGGACGCGGCTGTTTCAAAGTTTAAAAGAGTCATATCTCTCTTAGATCGAACT 
CGAACCGGACACGCCCGGTTTAGACGTGCTCCGGTTCATGTTATTTCTCCGGTTCTTTTA 
CAAGAAGAACCAAAAACGACGCCGTTTCAGTCTCCTCTTCCTCCTCCGCCGCAAATGATC 
CGAAAAGGTTCGTTTTCTTCATCGATGAAAACGATTGATTTCTCATCTCTCTCCTCTGTA
ACAACGGAATCAGACAACCAGAAGAAGATTCATCATCATCAACGTCCCTCTGAAACGGCG 
CCGTTTGCGTCTCAAACTCAAAGCCTCTCCACGACGGTCTCGTCTTTCTCAAAATCAACA 
AAGAGAAAATGTAACTCTGAGAATCTTCTCACCGGAAAATGCGCTTCCGCTTCTTCCTCC 
GGTCGTTGTCATTGCTCGAAGAAAAGAAAGATAAAACAGAGGAGAATAATTAGGGTTCCG
GCGATAAGTGCAAAAATGTCCGATGTACCACCGGACGATTATTCATGGAGGAAATACGGA 
CAAAAACCAATTAAAGGATCTCCACATCCAAGAGGATATTATAAGTGTAGTAGCGTAAGA 
GGTTGTCCAGCACGTAAACATGTTGAGAGAGCAGCTGATGATTCGTCCATGTTGATTGTT 
ACTTATGAAGGAGATCATAATCATTCTCTCTCCGCCGCTGATCTCGCCGGAGCCGCCGTT 
GCTGATCTTATTTTGGAATCGTCTTGAAAAGAACAAATCTTTATTTAAGGCTTTTATAAT 
ATAAATTTAGATCCTTACTTAGTGAAGTACTCAAACTATGAATGAAATCAATGTAATCAA 
AATCAAAAAGCTTTTGCTAAAAAAAAAAAAAAAAA 

>G189 Amino Acid Sequence (domain in AA coordinates: 240-297) 
MAVELMTRNYISGVGADSFAVQEAAASGLKSIENFIGLMSRDSFNSDQPSSSSASASASA 
AADLESARNTTADAAVSKFKRVISLLDRTRTGHARFRRAPVHVISPVLLQEEPKTTPFQS
PLPPPPQMIRKGSFSSSMKTIDFSSLSSVTTESDNQKKIHHHQRPSETAPFASQTQSLST 
TVSSFSKSTKRKCNSENLLTGKCASASSSGRCHCSKKRKIKQRRIIRVPAISAKMSDVPP 
DDYSWRKYGQKPIKGSPHPRGYYKCSSVRGCPARKHVERAADDSSMLIVTYEGDHNHSLS 
AADLAGAAVADLILESS*
>G1939 (92..844)

AATCATTAGCTTCTTeTCTTCTCTTCTCTCACAGAGAGAGTAATCACAAGCCAAGTGAGA 
AAAAGAAAACACTAAACCCAGATCGAAAACCATGTCTATTAACAACAACAACAACAACAA
CAACAATT^CAACGATGGTCTTATGATCTGATCAAACGGAGCTTTAATCGAACAACAACC 
ATCAGTCGTTGTGAAGAAACCACCGGCGAAAGATCGACATAGCAAAGTCGATGGAAGAGG 
GAGAAGAATCCGTATGCCGATTATATGTGCTGCTCGTGTTTTTCAGCTAACGAGAGAGCT 
TGGTCATAAGTCAGATGGCCAAACAATTGAATGGTTACTTCGTCAAGCAGAGCCTTCTAT 
TATAGCTGCAACAGGAACTGGTACAACTCCAGCGAGTTTCTCAACTGCTTCTGTCTCTAT 
CCGTGGAGCCACCAATTCTACTTCTTTAGATCATAAACCCACTTCTTTACTTGGTGGTAC 
GTCACCGTTTATACTTGGGAAACGTGTTAGAGCTGATGAGGATAGTAATAATAGTCATAA
TCATAGTTCTGTTGGTAAAGATGAGACCTTTACX3ACAACACCAGCTGGGTTTTGGGCTGT 
TCCGGCGAGGCCGGATTTTGGACAAGTTTGGAGTTTTGCTGGAGCTCCACAAGAGATGTT 
TTTACAACAACAACATCATCATCAGCAACCATTGTTTGTTCA^ 

AGCTGCAATGGGTGAAGCTTCTGCTGCTAGAGTTGGGAATTATCTTCCGGGTCATCTTAA 

TTTGCTTGCTTCTTTATCCGGTGGATCTCCCGGGTCGGATCGAAGAGAGGAAGATCCACG 

TTAATGGTTTAAGCCCTTTTAGGTTTGAGGGCAAAATTTGGTATATATATTTA 

CTCTTCTCTATTGTTGT(^TTGTTTCTCTATGTGTGTGTTTTAGTGTTGTTAGAGATTGA 

TTTGGTTTCAGAATCTCTGCAAGTGATTTGAGAGTTTTCGTTAGCTTTAAG^ 

GACGGTTGTTTTTGATTAGGGTTAAATTAGGGTTTAAGAATCrrGTTGTTTTT 

AGATCGATTTCTTATCGGATCCAAGATTACTTTTAGGAAAAAAGGGAAAATTTCAGAAAC 

CACGGTGGTTTCTTTTCCTCTTTTTTTTTTTG 

>G1939 Amino Acid Sequence (domain in AA coordinates: 40-102) 
MS INNKNlJNNNimND 

ARVFQLTRELGHKSDGQTIEWLLRQAEPSIIAATGTGTTPASFSTASVSIRGATNSTSLD
HKPTSLLGGTSPFILGKRVRADEDSNNSHNHSSVGKDETFTTTPAGFWAVPARPDFGQVW
SFAGAPQEMFLQQQHHHQQPLFVHQQQQQQAAMGEASAARVGNYLPGHLNLLASLSGGSP

GSDRREEDPR* 
>G194 (192..1205)

TCTTTCTTCTCTCTCTATCTCTCCTCTTTGAACCCTAAAAACTCTTTCTTTACAAGGATT 
GATCTTTTTGTATTTTTGATTTTGA 

TTTCTCTGTTTTTAAAGCCATTTGATAGATTGTTTCCGGTAAAGCTCAGCGAGAGAAGAA 
GAAGAACAACAATGGAGTTTACAGATTTCTCAAAGACGAGTTTTTACTACCCGTCGTCAC 
AAAGCGTTTGGGATTTCGGAGATTTAGCGGCGGCGGAGAGGCATTCTTTAGGGTTCATGG 
AGTTATTAAGTTCTCAGCAGCATCSU^GACTTTG 

TCCAAACGTCTCAACCGCAAACGCAAACGCAACCATCGGCGAAGCTGTCTTCAAGTATCA 
TTCAAGCTCCACCGTCAGAGCAATTAGTGACGTCAAAGGTGGAGTCTTTGTGTTCGGATC 
ATTTGTTGATAAACCCACCGGCGACTCCTAACTCGTCATCGATTTCGTCTGCTTCAAGCG 
AGGCTCTAAATGAAGAGAAACCGAAAACAGAAGACAATGAAGAAGAAGGAGGTGAAGATC 
AACAAGAGAAGAGTCATACTAAGAAACAGTTGAAAGCAAAGAAGAATAATCAGAAGAGAC
AGAGAGAGGCAAGAGTCGCATTCATGACAAAGAGTGAAGTTGATCATCTCGAAGATGGTT 
ATCGCTGGCGAAAATATGGTCAAAAAGCTGTCAAAAACAGTCCTTTTCCCAGGAGTTACT 
ACCGTTGCACAACGGCTTCATGTAACGTGAAGAAGAGAGTGGAGAGATCATTCAGAGATC 
CAAGCACTGTGGTTACAACCTACGAAGGTCAACACACTCACATTAGTCCACTCACGTCTC 
GTCCTATTTCCACTGGAGGTTTCTTCGGATCGTCAGGAGCTGCTTCGAGTCTCGGTAATG 
GTTGCTTTGGGTTTCCTATTGATGGCTCCACGTTAATCTCTCCTCAGTTCCAACAGCTTG 
TCCAATACCATCACCAACAGCAGCAACAAGAACTCATGTCTTGTTTTGGAGGAGTCAACG
AGTACCTTAATAGCCACGCTAATGAGTATGGTGATGATAATCGTGTGAAGAAGAGTCGAG 
TTTTGGTTAAAGATAATGGACTTCTGCAAGATGTTGTTCCGTCTCATATGTTGAAGGAAG 
AGTAGTAGTATATATATAGTCTTATAGTTTTAATCTAGTTTTTTTTTGTATAATTGTCTA 
AAAGAAACGGATCTTTTGTTCTGATGAAGAAGATGTTTTCTTATGGTTCTGAAATCGTAA 
GGTAATGATGATTGTACGAAGCCGAGAAAGTACTTGTGATTTTCACCATTGAATCACTAT 
AAATGTAATTTTTATTTACTGTGAAAAAAAAAAAAAAA 

>G194 Amino Acid Sequence (domain in AA coordinates: 174-230) 

MEFTDFSKTSFYYPSSQSVWDFGDLAAAERHSLGFMELLSSQQHQDFATVSPHSFLLQTS

QPQTQTQPSAKLSSSIIQAPPSEQLVTSKVESLCSDHLLINPPATPNSSSISSASSEALN 

EEKPKTEDNEEEGGEDQQEKSHTKKQLKAKKNNQKRQREARVAFMTKSEVDHLEDGYRWR 

KYGQKAVKNSPFPRSYYRCTTASCNVKKRVERSFRDPSTVVTTYEGQHTHISPLTSRPIS

TGGFFGSSGAASSLGNGCFGFPIDGSTLISPQFQQLVQYHHQQQQQELMSCFGGVNEYLN

SHANEYGDDNRVKKSRVLVKDNGLLQDVVPSHMLKEE* 

>G1943 (137..1858)

ACATTTGTTTCTAATCTCAGACATAAATAATTTTTGTTCCCGACTT 

ATTATATCATTCCACATTCATTTTCTTCTACTTCTTCCTTCTCCTTGATCTCATTTCCCT 

AGAAAATCCATCTATCATGGGTGAAGATGATATAGTGGAGCTCTTATGGAAGAGTGGCCA 

AGTCGTTAGAACCAGTCAAACACAGAGACCCTCCTCCAATACACCACCATCTCTTCCTCC 

ACCACCCATTCTTCGTGGTAGCGGAAGCGGCAACGGAGAAGAAAATGCCCCGCTTCCACT 

TCCACAGCCTTCACCTCCCCTC(^TCATCAGAATCTTTTC^TTCIWAAGACGAAATGTC 
TTCTTGGCTTCACCATTCTCACCCCGGCGTTACGTCCACCCCGGCTTCTTCTGTCTCCCT
GCCACCACCACCCAATGCTCCGCGTGAAGATGATAT^ 

CCAAGTAGTTGGAACCAACCAAACACATAGACAATCCTACGATCCTCCTCCCATTCTCCG 

CGGCAGCGGAAGTGGCAGAGGAGAAGAAAATGCTCCCCTTTCACAACCTCCGCCTCACCT 

GCATCAGCAAAATCTCTTCATTC^GAAGGCGAAATGTATTCGTGGCTACACCATTCTTA 

CCGCCAAAACTATTTCTGCTCAGAACTTCTCAACTCCACTCCGGCTACTCACCCGCAAAG

TTCCATCTCTCTGGCACCACGTCAGACTATCGCCACGAGAAGGGCGGAA^ 

CTTCTCGTGGCTAAGAGGGAACATATTTACCGGCGGTAGAGTTGATGAAGCTGGACCGTC 

GTTTTCGGTGGTAAGAGAATCGATGCAGGTAGGCTCGAACACGACCCCCCCTTCTTCTTC 

TGCCACTGAATCATGTGTAATACCAGCTACAGAGGGCACCGCGAGTCGAGTGTCGGGAAC 

TTTGGCAGCTCATGATCTTGGTCGGAAGGGAAAGGCGGTGGCGGTTGAGGCGGCCGGAAC 

ACCATCTTCAGGAGTGTGCAAGGCCGAAACAGAGCCGGTTCAGATACAACCAGCAACGGA 

GTCGAAGCTAAAAGCGAGAGAAGAAACCCATGGAACTGAAGAAGCTCGTGGTTCAACGTC 

TAGAAAGAGATCACGAACTGCAGAAATGCATAACCTCGCCGAAAGGAGAAGGAGAGAAAA 

GATCAACGAGAAGATGAAGACTCTGC7UVCAACTCATTCCTCGCTGCAACAAGGTTGAATC 

TGATTCTGTTTCTACTCTGATCAGTCTACTAAAGTTTCAACGCTGGATGATGCTATCGAG 

TACGTCAAATCGTTACAGAGCCAAATACAAGTATGCTCTTCAAAACAGAATGTGTTTTAA 

ACCAATGGTTCAACATGGAAAGAGTTCATATGTATCTAGTTTTGTTGAGATGATGTCGAC 

GGGACAGGGTATGATGTCGCCAATGATGAATGCCGGGAATACGCAACAGTTCATGCCCCA 

TATGGCCATGGATATGAACCGACCTCCTCCATTCATACC^^ 

TATGCCGGCTCAAATGGCAGGTGTAGGTCCATCATATCCAGCACCGCGCTACCCTTTTCC 
CAACATTCAGACCTTTGACCCATCCAGAGTCCGTTTACCAAGCCCGCAGCCTAACCCGGT 
GTCGAACCAGCCTCAGTTTCCGGCTTACATGAATCCCTATAGCCAGTTTGCTGGTCCCCA 
CCAGTTGCAACAACCTCCTCCTCCTCCATTTCAGGGTC 

CGGGCAGGCAAGTAGTAGCAAGGAACCTGAGGATCAGGAGAACCAACCAACAGCTTAGTT 
AAAGTGTGGAGCTGAAACGGATCAGTTCTTCAAGCAAATTACAACTTTGAAGATAAACCA 
GAGTTGTAACATGTAGATTTTGTCTGTTAAGTTTAATGTAAGTACTTTTTAGTTAATGGG 
AAAGATACTGACAGGTTGCAAGGTGGTCAGTATTTGTGCATCACGCTTAAGATTCCTCGA 
TGTGGCCAGTATCTCCCTTTTCTAGCATGTGAGGTCCCTACTCTCTGGTTCTACGGAGAC 
CAAATGTTCGACTGATTAAACACACAATGACTTACCAAAAGTACACGCGGCCCATCCTCG 
TCTTTATGTTCCAAGTGCGACTGTTTGTTTATTTGTAAGCATTTTTCTTATAATAATAAA 
ACAGGTCTATCTTCGTTAAAAAAAA 

>G1943 Amino Acid Sequence (domain in AA coordinates: 335-406) 
MGEDDIVELLWKSGQVVRTSQTQRPSSNTPPSLPPPPILRGSGSGNGEENAPLPLPQPSP
PLHHQNLFILEDEMSSWLHHSHPGVTSTPASSVSLPPPPNAPREDDIVELLWQSGQVVGT
NQTHRQSYDPPPILRGSGSGRGEENAPLSQPPPHLHQQNLFIQEGEMYSWLHHSYRQNYF 
CSELLNSTPATHPQSSISLAPRQTIATRRAENFMNFSWLRGNIFTGGRVDEAGPSFSVVR
ESMQVGSNTTPPSSSATESCVIPATEGTASRVSGTLAAHDLGRKGKAVAVEAAGTPSSGV
CKAETEPVQIQPATESKLKAREETHGTEEARGSTSRKRSRTAEMHNLAERRRREKINEKM 
KTLQQLIPRCNKVESDSVSTLISLLKFQRWMMLSST^ 

GKSSYVSSFVEMMSTGQGMMSPMMNAGNTQQFMPHMAMDMNRPPPFIPFPGTSFPMPAQM
AGVGPSYPAPRYPFPNIQTFDPSRVRLPSPQPNPVSNQPQFPAYMNPYSQFAGPHQLQQP
PPPPFQGQTTSQLSSGQASSSKEPEDQENQPTA* 
>G21 (79..966)

TGTGGAGGAATATTAATACAGCCCACTTCACATCTATTTTTGTGCAACCATCTCTCTAAA
GCTTCTTCTCTCATAACAATGGCAAGACAAATCAACAT^ 

ACCTTTATCTCCTCCGCCATCCCCGCCGTATCTTCCTCCTCCTCCATCACCGCTTCCGCC 

TCATTGTCCTCTTCACCTACTACATCTTCCTCTTCTTCGTCATCAACAAATTCT

ATTGAGGAAGACAACTCTAAAAGAAAAGCATCTCGAAGATCATTGTCATCGTTAGTCTCC 

GTTGAAGACGATGATGATCAAAACGGTGGAGGTGGGAAACGGCGAAAGACCAACGGTGGA 

GATAAACATCCGACGTATAGAGGAGTGAGGATGAGGAGTTGGGGAAAATGGGTGTCGGAG 

ATTAGAGAGCCGAGAAAGAAATCAAGAATCTGGCTCGGGACTTATCCAACGGCTGAGATG 

GCAGCTCGAGCTCATGACGTAGCGGCTTTAGCCATTAAAGGTACAACGGCTTACCTCAAT 

TTTCCCAAGTTAGCCGGCGAGCTTCCTCGTCCAGTCACAAATTCTCCTAAAGACATTCAA 

GCCGCCGCCTCTTTAGCGGCCGTTAACTGGCAAGATTCGGTCAACGATGTGAGTAATTCT 

GAAGTGGCTGAAATAGTTGAAGCCGAGCCGAGTCGAGCCGTGGTGGCTCAGTTGTTTTCT 

TCGGACACAAGCACGACGACGACGACTCAGAGTCAAGAGTATTCGGAAGCTTCGTGTGCT 
TCGACTTCGGCGTGTACGGACAAAGACAGTGAGGAAGAGAAGCTGTTTGATTTGCCGGAT

TTGTTTACCGATGAGAATGAGATGATGATACGAAACGATGCGTTTTGCTACTACTCGTCC 

ACGTGGCAGCTTTGTGGAGCCGATGCTGGGTTTCGGCTTGAAGAGCCGTTTTTTCTATCT 

GAATGACTAAAGTACCCCTCTCGAGAGAGCTCTCACTAACACT 

>G21 Amino Acid Sequence (domain in AA coordinates: 97-164) 

MARQINIESSVSQVTFISSAIPAVSSSSSITASASLSSSPTTSSSSSSSTNSNFIEEDNS 

KRKASRRSLSSLVSVEDDDDQNGGGGKRRKTNGGDKHPTYRGVRMRSWGKWVSEIREPRK

KSRIWLGTYPTAEMAARAHDVAALAIKGTTAYLNFPKLAGELPRPVTNSPKDIQAAASLA

AVNWQDSVNDVSNSEVAEIVEAEPSRAVVAQLFSSDTSTTTTTQSQEYSEASCASTSACT

DKDSEEEKLFDLPDLFTDENEMMIRNDAFCYYSSTWQLCGADAGFRLEEPFFLSE*

>G2132 (42..1031)

ATTCTGTTACTTAGTACCGGAGTTTAGTCGGAGAGAGAACAATGATCAGTTTCAGAGAAG 

AGAACATCGATCTCAACTTGATTAAAACAATTAGTGTAATCTGTAATGATCCAGACGCCA 

CCGATTCCTCTAGCGACGATGAATCTATCTCCGGCAATAATCCTCGCCGTCAGATCAAAC 

CAAAACCACCGAAACGTTACGTCTCAAAGATCTGTGTCCCGACGCTGATCAAAAGGTATG 

AGAACGTTTCGAATTCTACAGGGAATAAAGCAGCCGGAAACCGGAAAACGTCGTCGGGTT 

TCAAAGGCGTACGACGGAGGCCGTGGGGGAAATTTGCGGCGGAGATAAGAAATCCGTTTG 

AGAAGAAGAGAAAGTGGCTTGGAACGTTTCCTACTGAAGAAGAAGCAGCAGAAGCTTACC 

AAAAGAGTAAAAGAGAGTTTGATGAACGATTGGGTTTAGTTAAACAGGAAAAAGACCTAG 

TAGATTTGACCAAGCCGTGCGGTGTACGTAAACCAGAAGAGAAGGAAGTTACTGAGAAGT 

CGAATTGCAAAAAGGTAAATAAGAGAATTGTTACTGATCAGAAGCCATTTGGTTGTGGTT 

ATAACGCTGATCATGAAGAAGAGGGAGTGATTAGTAAAATGTTGGAAGATCCGTTGATGA 

CATCGTCAATTGCTGATATTTTTGGTGATTCGGCTGTTGAAGCAAATGATATTTGGGTGG 

ATTACAATTCAGTGGAATTTATTTCCATTGTAGATGATTTCAAGTTTGATTTTGTGGAGA 

ATGATAGAGTAGGAAAGGAGAAAACATTTGGATTTAAGATTGGGGATCACACTAAAGTTA 

ATCAACATGCCAAAATCGTATCGACCAATGGGGACTTATTCGTCGATGATTTACTTGATT 

TTGATCCGTTGATAGATGATTTTAAGTTAGAAGATTTTCCTATGGATGATCTTGGATTAT 

TAGGAGATCCAGAGGATGATGATTTTAGTTGGTTTAATGGTACTACTGATTGGATCGATA 

AGTTTTTATGAATACTTTCTTGACACGGCCAACGGTATTAGTAC 

>G2132 Amino Acid Sequence (domain in AA coordinates: TBD) 

MISFREENIDLNLIKTISVICNDPDATDSSSDDESISGNNPRRQIKPKPPKRYVSKICVP 

TLIKRYENVSNSTGNKAAGNRKTSSGFKGVRRRPWGKFAAEIRNPFEKKRKWLGTFPTEE

EAAEAYQKSKREFDERLGLVKQEKDLVDLTKPCGVRKPEEKEVTEKSNCKKVNKRIVTDQ 

KPFGCGYNADHEEEGVISKMLEDPLMTSSIADIFGDSAVEANDIWVDYNSVEFISIVDDF 

KFDFVENDRVGKEKTFGFKIGDHTKVNQHAKIVSTNGDLFVDDLLDFDPLIDDFKLEDFP 

MDDLGLLGDPEDDDFSWFNGTTDWIDKFL* 

>G2145 (1..777) 

ATGGACGTTTTTGTTGATGGTGAATTGGAGTCTCTCTTGGGGATGTTCAACTTTGATCAA 

TGTTCATCATCTAAAGAGGAGAGACCGCGAGACGAGTTGCTTGGCCTCTCTAGCCTTTAC 

AATGGTCATCTTCATC^CATCAACACC^TAACAATGTCTTATCTTCTGATC^TCATGCT 

TTCTTGCTCCCTGATATGTTCCCATTTGGTGCAATGCCGGGAGGAAATCTTCCGGCCATG 

CTTGATTCTTGGGATGAAAGTCATCACCTCCAAGAAACGTCTTCTCTTAAGAGGAAACTA 

CTTGACGTGGAGAATCTATGCAAAACTAACTCTAACTGTGACGTCACAAGACAAGAGCTT 

GCGAAATCCAAGAAAAAACAGAGGGTAAGCTCGGAAAGCAATACAGTTGACGAGAGCAAC 

ACTAATTGGGTAGATGGTCAGAGTTTAAGCAACAGTTCAGATGATGAGAAAGCTTCGGTC 

ACAAGTGTTAAAGGCAAAACTAGAGCCACCAAAGGGACAGCCACTGATCCTCAAAGCCTT 

TATGCTCGGAAACGAAGAGAGAAGATTAACGAAAGGCTCAAGACACTACAAAACCTTGTG 

CCAAACGGGACAAAAGTCGATATAAGCACGATGCTTGAAGAAGCGGTCCATTACGTGAAG 

TTCTTGCAGCTTCAGATTAAGTTGTTGAGCTCGGATGATCTATGGATGTACGCACCATTG 

GCTTACAACGGCCTGGACATGGGGTTCCATCACAACCTTTTGTCTCGGCTTATGTGA

>G2145 Amino Acid Sequence (domain in AA coordinates: 166-243)

MDVFVDGELESLLGMFNFDQCSSSKEERPRDELLGLSSLYNGHLHQHQHHNNVLSSDHHA 

FLLPDMFPFGAMPGGNLPAMLDSWDESHHLQETSSLKRKLLDVENLCKTNSNCDVTRQEL

AKSKKKQRVSSESNTVDESNTNWVDGQSLSNSSDDEKASVTSVKGKTRATKGTATDPQSL 
YARKRREKINERLKTLQNLVPNGTKVDISTMLEEAVHYVKFLQLQIKLLSSDDLWMYAPL
AYNGLDMGFHHNLLSRLM* 
>G23 (22..732)
TATCAAACGAGAGTACAAAAGATGACGTCACTCAACAGCTCTGCATCACCAACATCATCG
TCATCAGACCAATCTGATGCAACTACTACAACAAGCACCCACTTGTCTGAAGAAGAAGCT 
CCACCCAGAAACAACAACACAAGAAAGAGAAGGAGAGATTCTTCTTCTGCTTCTTCATCT 
TCTTCAATGCAACATCCTGTTTACAGAGGTGTGCGGATGAGAAGTTGGGGCAAATGGGTC 
TCCGAGATCCGACAACCTCGTAAGAAAACTCGTATTTGGCTCGGCACTTTTGTCACCGCT 
GATATGGCTGCTCGTGCTCACGACGTCGCTGCTCTCACCATCAAAGGCTCCTCCGCCGTC 
TTAAATTTCCCTGAGCTTGCTTCTCTCTTCCCTCGTCCGGCGTCATCATCGCCGCATGAT 
ATCCAGACAGCCGCCGCAGAAGCCGCCGCCATGGTGGTCGAAGAAAAACTGTTAGAGAAG 
GATGAGGCTCCGGAGGCCCCACCTTCGTCGGAATCTTCTTACGTGGCGGCGGAGTCAGAG 
GATGAGGAGAGGTTGGAGAAAATTGTGGAGCTGCCTAACATTGAAGAAGGAAGTTATGAC 
GAGAGTGTGACATCACGTGCTGATCTGGCTTATTCTGAGCCGTTCGATTGTTGGGTGTAT 
CCTCCGGTTATGGATTTTTATGAAGAAATATCGGAGTTTAATTTCGTGGAATTGTGGAGC 
TTTAATCACTAATTAAGTTAGGAAAGTGCATTATATTGCAATATTGCATCATAGATAACA 
TTTGTATTTCTTTTCTTTTTGTACGGATACGTAGCATATGCTACTATACTAGGGCTAGTG 
TACCAAATATTGTAAAATATACTTATTAATATTTATGTAAATGTGTAATATATATAACAT 
ACAATTATTGTAAGTTTGGAAATTGGAAACTATCGTTACGCAATGTTCTTGTAAAAAAAA 

AAAAAAAAAA 

>G23 Amino Acid Sequence (domain in AA coordinates: 61-117) 

MTSLNSSASPTSSSSDQSDATTTTSTHLSEEEAPPRNNNTRKRRRDSSSASSSSSMQHPV

YRGVRMRSWGKWVSEIRQPRKKTRIWLGTFVTADMAARAHDVAALTIKGSSAVLNFPELA

SLFPRPASSSPHDIQTAAAEAAAMVVEEKLLEKDEAPEAPPSSESSYVAAESEDEERLEK

IVELPNIEEGSYDESVTSRADLAYSEPFDCWVYPPVMDFYEEISEFNFVELWSFNH*

>G2313 (104..724)

CGTCGACACAATCGCTCTTCCGTAACATATTCCACAAAACGATCTTCTTGTTTCTTGAAT 
TTTTAGCCATCTCTTTTTTTTTTTTCTCATTTTCTCGGATACTATGGCTTCGAGTCCACG 
CTGGACGGAGGACGACAACAGGCGTTTTAAGTCAGCTCTGTCGCAATTCCCTCCGGATAA 
CAAGCGTTTGGTGAATGTCGCCCAGCATCTGCCGAAACCTTTGGAGGAGGTGAAGTACTA 
CTACGAAAAGTTGGTCAACGATGTTTATCTGCCGAAACCTTTAGAGAATGTCACCCAGCA 
TCTGCAGAAACCTATGGAAATGGAGGAGATGAAGTACATGTACGAAAAGATGGCCAACGA 
TGTTAATCAGATGCCCGAGTACGTACCACTGGCGGAATCGAGTCAGTCCAAACGCAGGAA 
GAAGGATACGCCAAATCCTTGGACAGAAGAGGAACACAGATTGTTTCTGCAAGGATTGAA
AAAGTATGGGGAAGGAGCTTCGACGTTGACATCAACGAATTTTGTGAAGACAAAGACTCC 
ACGG(^^GTGTCAAGCCATGCACAGTATTACAAAAGGOVAAAATCGGACAATAAGAAGG^ 
GAAACGCCGGAGTATTTTTGACATAACTTTGGAGTCTACCGAGGGCAATCCAGATTCTGG 
AAATCAGAACCCTCCGGATGATGATGATCCGTCCCAAGGTCAAGGCACTTGTCTTGGAGT 
TTAGATGTTGGAAGATAGAAGAATGGTGTGAAAGC 

>G2313 Amino Acid Sequence (domain in AA coordinates: TBD) 

MASSPRWTEDDNRRFKSALSQFPPDNKRLVNVAQHLPKPLEEVKYYYEKLVNDVYLPKPL

ENVTQHLQKPMEMEEMKYMYEKMANDVNQMPEYVPLAESSQSKRRKKDTPNPWTEEEHRL

FLQGLKKYGEGASTLTSTNFVKTKTPRQVSSHAQYYKRQKSDNKKEKRRSIFDITLESTE 
GNPDSGNQNPPDDDDPSQGQGTCLGV*
>G2344 (1..573) 

ATGACTTCTTCAATCCATGAGCTTTCTGATAACATTGGAAGTCATGAGAAGCAAGAACAG 

AGAGATTCTCATTTCCAACCACCAATCCCTTCTGCAAGAAATTATGAATCAATTGTTACA 

AGTTTAGTCTACTCAGACCCGGGGACTACAAATTCCATGGCACCTGGACAATATCCATAT 

CCAGATCCTTACTACAGAAGCATATTTGCACCGCCTCCACAACCGTATACCGGGGTACAT 

CTACAGTTGATGGGAGTGCAGCAACAAGGCGTTCCTTTACCATCTGATGCAGTCGAGGAA 

CCTGTTTTTGTTAAeGCAT^GCAATACCACGGTATACTAAGGCGCAGACAA^ 

AGACTTGAGTCTCAGAATAAAGTCATCAAGTCACGTAAGCCGTATTTGCATGAATCTCGG 

CATTTGCATGCGATAAGACGACCAAGAGGATGTGGCGGGCGGTTTCTAAATGCCAAGAAG 

GAGGATGAGCATCACGAAGACAGTAGTCATGAAGAAAAATCCAACCTTAGCGCTGGTAAA 

TCCGCCATGGCTGCTTCTAGTGGTACATCTTGA 

>G2344 Amino Acid Sequence (domain in AA coordinates: TBD) 
MTSSIHELSDNIGSHEKQEQRDSHFQPPIPSARNYESIVTSLVYSDPGTTNSMAPGQYPY 
PDPYYRSIFAPPPQPYTGVHLQLMGVQQQGVPLPSDAVEEPVFVNAKQYHGILRRRQSRA
RLESQNKVIKSRKPYLHESRHLHAIRRPRGCGGRFLNAKKEDEHHEDSSHEEKSNLSAGK

SAMAASSGTS* 
>G2430 (69..1907)

AACTTC^CATACAC^TAATCTCTCACTTAT^AAATATCTCTCTCTCTCTCTCTACAAAAT 

CAATTCCAATGTTGGTGGGAAAGATAAGTGGATATGAAGATAATACTCGCTCTTTGGAGC 

GAGAAACATCTGAAATCACTTCTCTTCTCAGCCAATTTCCGGGGT^TACTAATGTCCTTG 

TTGTTGACACCAATTTCACCACTCTACTC^CATGAAACAAATCATO 

ATCAAGTGTCTATTGAGACAGATGCAGAAAAAGCTCTTC 

ATGAAATCAATATTGTGATTTGGGATTTTCATATGCCTGGAATTGATGGACTTCAAGCTC 
TCAAGAGCATTACTTCAAAGTTGGATTTACCTGTAGTGATTATGTCTGATGATAATCAAA 
CGGAATCTGTGATGAAAGCAACATTTTACGGTGCTTGTC 

AAGAAGAGGTAATGGCCAATATATGGCAACACATTGTACGGAAGAGGCTGATCTTTAAAC
CGGATGTTGCTCCACCGGTTCAATCAGATCCGGCTCGCTCTGACCGTTTAGACCAAGTCA 
AAGCTGATTTGAAGATCGTAGAAGATGAACCAATAATCA^ 

GGACCGAAGAAATTCAACCGGTT(^GTCAGATCTGGTTCAAGCCAACAAGTTCGACCAAG 
TGAATGGCTATTCCCCAATCATGAACCAAGATAA 

CGCGAATGACGTGGACAGAAGTTATTCAACCGGTTCAATCAAATCTGGTTCAAACAAAAG 

AGTTCGGCCAACTCAATGACTATTCCCAAATCATGAACCAAGATAGCATGTACAACAAAG 

CAGCAACCAAACCACAATTGACGTGGACCGAAGAAATTCAACCGGTTCAATCAGGTCTG^ 

TTCAAGCCAACGAGTTCAGCAAAGTGAATGGATATTCCCAAAGCATGAACCAAGAT^ 

TGTTCAACAAATCAGCAACCAACCCGCGATTGACATGGAACGAATTACTTCAACCGGTTC 

AATCAGATCTGGTTCAATCCAATGAGTTTAGCCAATTGAGTGACTATTCTCAAATCATGA 

ACGAAGATAACATGTTCAACAAAGCAGCAAAGAAACCGCGGATGACATGGAGTGAAGTAT 

TTCAACCGGTTCAATCACATCTGGTTCCGACTGACGGTTTAGACCGAGACCACTTTGATT 

CCATAACCATAAACGGAGGTAACGGC^TACAAAAC^TGGAAAAGAAACAAGGAAAAAAAC 

CACGGAAGCCGCGGATGACGTGGACCGAAGAGCTTCACCAAAAATTTCTGGAAGCCATCG 

AAATAATTGGTGGTATCGAAAAAGCTAACCCAAAGGTACTTGTCGAATGCTTGCAAGAAA

TGAGGATAGAAGGAATTACTAGAAGCAATGTGGCAAGTCATCTTCAGAAACACCGTATCA 

ATCTTGAAGAAAACCAAATTCCTCAACAAACACAAGGGAATGGTTGGGCCACTGCGTATG 

GTACACTAGCTCCCTCTCTCCAAGGTTCAGACAATGTCAACACAACAATACCATCGTACC 

TTATGAATGGTCCAGCCACTTTGAACCAAATCCAGCAGAATCAATATCAAAATGGTTTCT 

TGACAATGAACAACAACCAGATCATAACCAATCCTCCGCCTCCTTTGCCCTATTTGGACC 

ATCAT(^CC^CAGCAA(^TCAGTCTTCTCCTCAATTTAATTACCTGATGAA(^TGAAG 

AACTTCTTCAAGCCTCTGGCCTCTCTGCGACAGATCTTGAACTCACTTATCCAAGTTTAC 

CATATGATCCACAAGAGTATCTAATCAATGGCTACAATTATAATTAGTCATATAGCCCTT 

CTCTTTACTTAAGGCAGTCTATGTATGACAAATAATATGCGACTTCCCTTGTGAGTCACA 

ATATTGTTTCATTATTC 

>G2430 Amino Acid Sequence (domain in AA coordinates: 425-478)
MLVGKISGYEDNTRSLERETSEITSLLSQFPGNTNVLVVDTNFTTLLNMKQIMKQYAYQV 
SIETDAEKALAFLTSCKHEINIVIWDFHMPGIDGLQALKSITSKLDLPVVIMSDDNQTES
VMKATFYGACDYWKPVKEEVMANIWQHIVRKRLIFKPDVAPPVQSDPARSDRLDQVKAD
FKIVEDEPIINETPLITWTEEIQPVQSDLVQANKFDQVNGYSPIMNQDNMFNKAPPKPRM
TWTEVIQPVQSNLVQTKEFGQLNDYSQIMNQDSMYNKAATKPQLTWTEEIQPVQSGLVQA
NEFSKVNGYSQS^QDSMFNKSATNPRLTWN^ 
NMFNKAAKKPRMTWSEVFQPVQSHLVPTDGLDRPHFM 

PRMTWTEELHQKFLEAIEIIGGIEKANPKVLVECLQEMRIEGITRSNVASHLQKHRINLE
ENQIPQQTQGNGWATAYGTLAPSLQGSDNVNTTIPSYLMNGPATLNQIQQNQYQNGFLTM
NNNQIITNPPPPLPYLDHHHQQQHQSSPQFNYLMl^ELLQASGLSATDLELTYPSLPYD 

PQEYLINGYNYN* 
>G2517 (66..899)

TCCTCACTCTCTCTCTTTTTCTCTAACCATAAAATCTCTTTGATCTCTTTCTCTGTGTTT 
TGATAATGGAAAATGTTGGTGTTGGGATGCCGTTTTACGATTTAGGGCAAACAAGGGTTT 
ACCCACTCTTGTCTGATTTCCACGATTTATCGGCGGAGAGGTATCCGGTAGGGTTCATGG 
ATTTACTGGGTGTTCATCGTCATACACCCACCCATACGCCGTTGATGCATTTTCCGACCA 
CACCTAACTCGTCCTCGAGCGAAGCTGTGAATGGAGATGACGAAGAAGAAGAAGATGGAG 
AAGAACAGCAGCATAAGACAAAGAAGCGGTTTAAATTCACTAAAATGAGTAGAAAGCAGA 
CGAAGAAGAAGGTGCCAAAAGTGTCATTCATCACGAGGAGTGAGGTTCTTCATCTAGATG 
ATGGTTATAAGTGGAGAAAATACGGTCAAAAACCTGTCAAAGACAGCCCTTTTCCAAGAA 
ATTATTACCGTTGCACAACAACTTGGTGTGACGTGAAGAAGAGAGTAGAGAGATCATTCA 
GTGATCCAAGCAGTGTAATCACCACTTACGAAGGTCAACATACTCATCCTCGTCCACTAC
TCATCATGCCCAAAGAAGGCAGCTCTCCATCCAATGGCTCAGCTTCTAGGGCCCACATTG
GCCTCCCTACACTCCCTCCTCAGCTTTTAGATTACAACAACCAAC^CAAC^GCGCCGT 
CTTCTTTTGGAACCGAGTACATTAACAGGCAAGAAAAAGGAATTAATCATGATGATGATG 
ACGATCATGTTGTGAAGAAGAGTCGAACTCGGGATCTGCTGGATGGAGCTGGTTTAGTCA 
AAGATCATGGCCTTCTTCAGGATGTTGTTCCCTCTCATATCATTAAGGAAGAGTATTAGT 
TAATCGCATAATTATGTAGCTAGCTAGCTAG 

>G2517 Amino Acid Sequence (domain in AA coordinates: TBD) 

MHWGVGMPFYDLGQTRVYPLLSDFHDLSAERYP 

NSSSSEAVNGDDEEEEIXitfSEQQHKTKKRFKFTK^ 

YKWRKYGQKPVKDSPFPRNYYRCTTTWCDVKKRVERSFSDPSSVITTYEGQHTHPRPLLI
MPKEGSSPSNGSASRAHIGLPTLPPQLLDYNNQQQQAPSSFGTEYINRQEKGINHDDDDD
HVVKKSRTRDLLDGAGLVKDHGLLQDVVPSHIIKEEY*
>G2521 (103..768)

ATTCTCCACAATTTCATAACTTTCTTCCGCTC 

TCTTTCAATACGACTGCGGAGAT(^GAGCCAATTATTTGGTTATGGCGTCTOTGATCTCA 
GATATTGAACCGCCGACGAGTACTACTTCAGATCTCGTTCGGAGAAAGAAGAGATCCTCT 
GCTTCATCCGCCGCATCGTCTCGTTCAAGCGCATCTTCCGTCTCCGGTGAGATTCACGCG 
CGATGGCGATCGGAGAAGCAACAACGGATCTACTCAGCCAAACTGTTCCAAGCGCTCCAA 
CAAGTCCGCCTCAACTCTTCCGCCTCAACATCATCATCTCCAACGGCTCAGAAACGAGGA 
AAGGCCGTCCGTGAAGCCGCCGATCGAGCTCTTGCCGTTTCCGCTCGGGGAAGAACACTC 
TGGAGCAGAGCGATCTTAGCTAATCGGATCAAACTGAAATTTCGTAAACAGAGACGTCCT 
CGAGCTACGATGGCGATTCCGGCCATGACTACGGTGGTTAGTAGCAGCAGCAACAGATCG 
AGAAAACGGAGAGTGTCGGTGTTGAGATTGAATAAGAAGAGTATACCGGATGTTAACCGG 
AAAGTACGTGTTCTAGGCCGGTTAGTTCCCGGTTGCGGTAAACAATCCGTACCGGTGATT 
CTAGAAGAAGCAACTGATTATATTCAGGCTCTGGAGATGCAAGTGAGAGCCATGAACTCT 
TTAGTTCAGCTTCTCTCCTCCTACGGCTCAGCTCCTCCACCGATTTGATGAGGTTAAAAT 
CGTCTTTTTAATTCTACCATCTCTCGATCTTTCACAGCTTATGTGTATATAGAAGATTCG 
GTTTGATTATAATCTGTAACTACTCTTCCCAACCGCTGATTCTTCTCTGCTACAAGTAAA 
AGTAAATTTTGAACCGAGTCTTCCCATTTTTACGATCCTCAAGTCTAAATTAAGTATATG 
ATTGATTAATAAAGTCTTTACCATTAGGGTTC 

>G2521 Amino Acid Sequence (domain in AA coordinates: 145-213) 

MASLISDIEPPTSTTSDLVRRKKRSSASSAASSRSSASSVSGEIHARWRSEKQQRIYSAK 

LFQALQQVRLNSSASTSSSPTAQKRGKAVREAADRALAVSARGRTLWSRAILANRIKLKF

RKQRRPRATMAIPAMTTVVSSSSNRSRKRRVSVLRLNKKSIPDVNRKVRVLGRLVPGCGK

QSVPVILEEATDYIQALEMQVRAMNSLVQLLSSYGSAPPPI* 

>G258 (60..983)

AGTGACCACCCTGCTGGTTAATCAACACCAAGAGACCTTGTAATATATAAGTTAGGAAGA 
TGAGAGAGAAGTGGGAAATGAAAAGAGATGAAATGGGACATCGATGTTGTGGAAAACACA 
AAGTGAAGAGAGGTCTTTGGTCTCCAGAGGAAGACGAGAAGCTTCTTCGTTATATCACCA 
CTCATGGTCATCCTAGTTGGAGTTCCGTTCCAAAGCTTGCCGGGTTGCAGAGATGTGGGA 
AGAGTTGCAGATTAAGGTGGATAAACTATCTAAGGCCTGATCTGAGGAGAGGTTCGTTTA 
ATGAGGAAGAAGAGCAGATTATCATCGACGTACATCGTATTCTTGGTAACAAATGGGCTC 
AGATTGCTAAGCACTTACCTGGACGCACTGATAATGAAGTCAAGAACTTTTGGAACTCAT
GCATTAAGAAGAAACTTCTTTCTCAAGGCTTAGATCCTTCTACACATAATCTTATGCCTT 
CACACAAAAGATCTTCTTCTTCAAACAATAATAATATCCCCAAGCCAAACAAAACGACGT 
CCATCATGAAGAACCCTACTGATCTTGATCAATCAACCACTGCTTTTTCAATCACAAACA 
TCAATCCACCCACTTCCACTAAACCAAACAAACTTAAATCTCCTAACCAGACTACAATCC 
CATCTCAAACCGTGATCCCTATCAATGATAACATGTCAAGTACTCAAACCATGATCCCTA 
TCAATGATCCCATGTCAAGTCTTTTAGATGATGAGAATATGATTCCTCACTGGTCAGATG
TTGATGGAATGGCGATCCACGAAGCTCCGATGTTGCCTAGTGATAAGGCAGTAGTGGGAG 
TGGATGATGATGATCTCAACATGGACATTTTGTTTAACACTCCTTCTTCTTCTGCTTTTG 
ATCCTGATTTTGCTTCCATTTTCTCCTCTGCAATGTCTATCGATTTCAATCCCATGGATG 
ATCTTGGCAGCTGGACCTTTTAGCTTTTACTCTACAGC 

>G258 Amino Acid Sequence (domain in AA coordinates: 24-124) 
MREKWEMKRDEMGHRCCGKHKVKRGLWS 

KSCRLRWINYLRPDLRRGSFNEEEEQIIIDVHRILGNKWAQIAKHLPGRTDNEVKNFWNS
CIKKKLLSQGLDPSTHNLMPSHKRSSSSNNNN

INPPTSTKPNKLKSPNQTTIPSQTVIPINDNMSSTQTMIPINDPMSSLLDDENMIPHWSD 
VDGMAIHEAPMLPSDKAVVGVDDDDLNMDILFNTPSSSAFDPDFASIFSSAMSIDFNPMD

DLGSWTF* 

>G280 (108..722)

AAGTTAATATGAGAATAATGAGAAAACCACTTTCCCAAATTGCTTTTTAAAATCCCTCCT 

CACACAGATTCCTTCCTTCATCACCTCACACACTCT 

TCCACCATGGCTCAGCTTCAGATACX3CATTCA^ 

CTTATCCTCAGATGATAATGGAAGCGATTGAGTCCTTGAACGATAAGAACGGCTGCAACA 

AAACGACGATTGCTAAGCACATCGAGTCGACTCAACAAACTCTACCGCCGTC^CAC^TGA 

CGCTGCTCAGCTACCATCTCAACCAGATGAAGAAAACCGGTCAGCTAATCATGGTGAAGA 

ACAATTATATGAAACCAGATCCAGATGCTCCTCCTAAGCGTGGTCGTGGCCGTCCTCCGA 

AGCAGAAGACTCAGGCCGAATCTGACGCCGCTGCTGCTGCTGTTGTTGCTGCCACCGTCG 

TCTCTACAGATCCGCCTAGATCTCGTGGCCGTCCACCGAAGCCGAAAGATCCATCGGAGC 

CTCCCCAGGAGAAGGTCATTACCGGATCTGGAAGGCCACGAGGACGACCACCGAAGAGAC 

CGAGAACAGATTCGGAGACGGTTGCTGCGCCGGAACCGGCAGCTCAGGCGACAGGTGAGC 

GTAGGGGACGTGGGAGACCTCCGAAGGTGAAGCCGACGGTGGTTGCTCCGGTTGGGTGCT 

GAATTAATCGGTACTTATGCAATTTCGGAATCTTTAGTTACTGAAAAATGG 

GAGAGTAAGAGAGTGCTTTAATTTAGCTTAATTAGATTTATTTGGATTTCTTTCAGTATT 

TGGATTGTAAACTTTAGAATTTGTGTGTGTGTTGTTGCTTAGTCCTGAGATAAGATATAA 

CATTAGCGACTGTGTATTATTATTATTACTGCATTGTGTTATGTGAAACTTTGTTCTCTT 

GTTGAAAAAAAAAAAAAAAAAAA 

>G280 Amino Acid Sequence (domain in AA coordinates: 97-104, 130-137, 155-162, 185-192)

MAFDLHHGSASDTHSSELPSFSLPPYPQMIMEAIESLNDKNGCNKTTIAKHIESTQQTLP
PSHMTLLSYHLNQMKKTGQLIMVKNNYMKPDPDAPPKRGRGRPPKQKTQAESDAAAAAVV
AATVVSTDPPRSRGRPPKPKDPSEPPQEKVITGSGRPRGRPPKRPRTDSETVAAPEPAAQ
ATGERRGRGRPPKVKPTVVAPVGC*
>G3 (16..477)

GTTTGTCTTTTATCAATGGAAAGAGAACAAGAAGAGTCTACGATGAGAAAGAGAAGGCAG 
CCACCTCAAGAAGAAGTGCCTAACCACGTGGCTACAAGGAAGCCGTACAGAGGGATACGG 
AGGAGGAAGTGGGGCAAGTGGGTGGCTGAGATTCGTGAGCCTAACAAACGCTCACGGCTT 
TGGCTTGGCTCTTACACAACCGATATCGCCGCCGCTAGAGCCTACGACGTGGCCGTCTTC 
TACCTCCGTGGCCCCTCCGCACGTCTCAACTTCCCTGATCTTCTCTTGCAAGAAGAGGAC 
CATCTCTCAGCCGCCACCACCGCTGACATGCCCGCAGCTCTTATAAGGGAAAAAGCGGCG
GAGGTCGGCGCCAGAGTCGACGCTCTTCTAGCTTCTGCCGCTCCTTCGATGGCTCACTCC 
ACTCCGCCGGTAATAAAACCCGACTTGAATCAAATACCCGAATCCGGAGATATATAGTCA 

CATAGATACTGGAAAATATAGGTATGTATACATTCATAAATTATCTTATGTATCAAAGAA 
TTTTATAGATTCTGATTAGCTTTTTGTTTTTGTTTTTGATAAGAACTCTGATTAGTTGTC 
CGGAGACAAAACCGGCTAAGAGCAATCCATGAGAAGCTAGCGAGTGTTTTTTAGTTCAAG 
TTGTAATATAAATGCATATTAATTCTTTAGTAATTTTGT 

>G3 Amino Acid Sequence (domain in AA coordinates: 28-95) 

MEREQEESTMRKRRQPPQEEVPNHVATRKPYRGIRRRKWGKWVAEIREPNKRSRLWLGSY

TTDIAAARAYDVAVFYLRGPSARLNFPDLLLQEEDHLSAATTADMPAALIREKAAEVGAR

VDALLASAAPSMAHSTPPVIKPDLNQIPESGDI* 

>G343 (1..795) 

ATGGACGTCTATGGCTTATCTTCACCAGACTTACTTCGAATCGACGACCTTCTTGATTTC 
TCCAACGAAGACATCTTCTCCGCTTCTTCTTCCGGTGGTTCCACCGCCGCTACTTCCTCT 
TCTTCTTTCCCTCCTCCTCAAAACCCTAGTTTCCACCACCACCATCTCCCTTCCTCCGCC
GATCATCACTCCTTCCTCCACGACATTTGCGTTCCCAGTGATGACGCAGCTCATCTTGAA
TGGCTTTCGCAATTCGTGGACGATTCTTTCGCTGATTTTCCGGCGAATCCATTAGGAGGA 
ACTATGACTTCTGTCAAAACTGAAACTTCCTTTCCGGGGAAACCAAGAAG(^AACGATCA 
AGAGCTCCTGCTCCTTTCGCCGGAACATGGTCTCCGATGCCACTGGAATCCGAGCATCAG 
CAGCTTCACTCCGCCGCCAAATTCAAGCCAAAGAAAGAACAATCCGGCGGAGGAGGAGGA 
GGAGGAGGAAGACATCAGTCATCGTCATCGGAGACTACGGAAGGAGGAGGAATGAGGAGA 
TGTACTCACTGTGCATCGGAGAAAACGCCACAGTGGAGGACAGGACCACTTGGACCTAAA 
ACACTATGTAACGCTTGTGGAGTCCGGTTTAAATCCGGTAGACTTGTACCGGAATATAGA

GAGCTTCGACGGC7VGAAAGAAGTTATGAGACAAC(^CAAC^GTTCAACTTCATCACCAC 
CACCACCCGTTTTAG 

>G343 Amino Acid Sequence (domain in AA coordinates: 178-214) 

MDVYGLSSPDLLRIDDLLDFSNEDIFSASSSGGSTAATSSSSFPPPQNPSFHHHHLPSSA

DHHSFLHDICVPSDDAAHLEWLSQFVDDSFADFPANPLGGTMTSVKTETSFPGKPRSKRS 

RAPAPFAGTWSPMPLESEHQQLHSAAKFKPKKEQSGGGGGGGGRHQSSSSETTEGGGMRR

CMCASEKTPQWRTGPLGPKTLCNACGVRFKSGRIjVPEYRPASSPTFVLTO 

ELRRQKEVMRQPQQVQLHHHHHPF* 

>G363 (1..780) 

ATGAGACCAATATTAGACCTCGAAATTGAAGCTTCATCGGGCAGTAGTAGCAGCCAAGTG 

GCCTCAAACTTGTCTCCGGTTGGGGAAGATTACAAACCAATCTCGCTGAATCTO 

AGTTTCAACAACAACMCAACAATAATCTGGATCTTGAATC^TCGTCTTTGACGCTGCCA 

CTTTCGAGCACGAGTGAGAGTAGTAACCCGGAGCAGCAGCAGCAACAACAACCATCTGTA 

TCAAAGAGAGTCTTCTCTTGTAACTACTGCCA7UVGGAAGTTCTATAGCTCTCAAGCGCTA 

GGTGGTCACCAAAACGCTCACAAACGTGAGAGAACACTCGCCAAACGCGCTATGCTATGG 

GTCTTGCTGGGGTCTTCCCCGGTAGAGGATCAAGTAGCAATTATGCGGCTGCTGCCACAG 

CAGCCGCTCTCGTGTTTGCCGCTTCACGGAAGCGGAAACGGGAACATGACATCGTTCAGG 

ACTTTGGGAATCCGGGCACATTCCTCGGCGCACGACGTCAGCATGACAAGGCAGACACCA 

GAAACACTTATTAGAAACATTGCCAGGTTCAACCAGGGGTATTTCGGTAATTGTATACCT 

TTTTACGTGGAGGACGACGAGGCCGAGATGCTCTGGCCGGGGAGTTTCCGGCAAGCTACG 

AATGCGGTTGCGGTTGAAGCGGGTAATGATAATTTAGGTGAAAGAAAAATGGATTTCTTG 

GACGTCAAGCAAGCGATGGATATGGAAAGTTCTCTTCCAGATCTAACCTTGAAGCTTTGA 

>G363 Amino Acid Sequence (domain in AA coordinates: 87-108)

MRPILDLEIEASSGSSSSQVASNLSPVGEDYKPISLNLSLSF 

LSSTSESSNPEQQQQQQPSVSKRVFSCNYCQRKFYSSQALGGHQNAHKRERTLAKRAMLW
VLLGSSPVEDQVAIMRLLPQQPLSCLPLHGSGNGNMTSFRTLGIRAHSSAHDVSMTRQTP
ETLIRNIARFNQGYFGNCIPFYVEDDEAEMLWPGSFRQATNAVAVEAGNDNLGERKMDFL
DVKQAMDMESSLPDLTLKL* 
>G370 (1..774) 

ATGGACGAAACCAACGGACGAAGAGAAACTCACGATTTCATGAACGTCAACGTTGAATCC 

TTCTCTC^GCTTCCITTCATCCGCCGTACTCCTCCCT^GAAAAAGCCGCCATTATTCGT 

CTCTTCGGCCAAGAGCTCGTCGGTGATAACTCCGACAACTTATCCGCAGAACCTTCTGAT 

CATCAAACCACTACCAAGAACGATGAGAGCTCTGAGAATATCAAGGACAAAGACAAAGAA 

AAAGATAAGGACAAAGACAAAGATAACAACAACAACAGGAGATTCGAGTGTCACTACTGC 

TTCAGAAACTTCCCAACTTCTCAAGCCCTAGGTGGACATCAAAACGCTCACAAACGTGAA 

CGTCAACACGCCAAACGCGGTTCCATGACATCATACCTTCATCATCATCAGCCTCATGAC 

CCTCACCACATCTACGGCTTCCTCAACAACCACCACCACCGTCACTATCCGTCTTGGACG 

ACGGAAGCTAGATCATACTACGGCGGAGGGGGACATCAAACGCCGTCGTACTACTCAAGG 

AATACTCTTGCTCCTCCTTCTTCTAACCCACCGACAATCAACGGAAGTCCTTTAGGTTTG 

TGGCGTGTACCGCCTTCC^CGTC^CAAATACTATTt^GGCGTTTACTCATCTTCACCA 

GCTTCAGCGTTTAGGTCGCATGAGCAAGAGACTAATAAGGAGCCTAATAACTGGCCGTAC 

AGATTGATGAAACCCAATGTGCAAGATCATGTGAGTCTCGATCTTCATCTCTGA 

>G370 Amino Acid Sequence (domain in AA coordinates: 97-117)

MDETNGRRETHDFMNVNVESFSQLPFIRRTPPKEKAAIIRLFGQELVGDNSDNLSAEPSD 

HQTTTKM)ESSENIKDKDKEKDKDKDKDN^ 
RQHAKRGSMTSYLHHHQPHDPHHIYGFLNNHHHRHYPSW

imiAPPSSNPPTINGSPLGLWRVPPSTSTNTIQGVYSSSPASAFRSHEQETNKEPN^ 
RLMKPNVQDHVSLDLHL*
>G385 (37..2202)

TAGGGTTTGCTTTCAGTTTCCGGAGTATAAGAAAAGATGTTCGAGCCAAATATGCTGCTT 
GCGGCTATGAACAACGCAGACAGCAATAACCACAACTACAACCACGAAGACAACAATAAT 
GAAGGATTTCTTCGGGACGATGAATTCGACAGTCCGAATACTAAATCGGGAAGTGAGAAT 
CAAGAAGGAGGATCAGGAAACGACCAAGATCCTCTTCATCCTAACAAGAAGAAACGATAT 
CATCGACACACCCAACTTCAGATCCAGGAGATGGAAGCGTTCTTCAAAGAGTGTCCTCAC 
CCAGATGACAAGCAAAGGAAACAGCTAAGCCGTGAATTGAATTTGGAACCTCTTCAGGTC 
AAATTCTGGTTCCAAAACAAACGTACCCAAATGAAGAATCATCACGAGCGGCATGAGAAC

TCACATCTTCGGGCGGAGAACGAAAAGCTTCGAAACGACAACCTAAGATATCGAGAGGCT 

CTTGCAAATGCTTCGTGTCCTAATTGTGGTGGTCCAACAGCTATCGGAGAAATGTCATTC 

GACGAACACCAACTCCGTCTCGAAAATGCTCGATTAAGGGAAGAGATCGACCGTATATCC 

GCAATCGCAGCTAAATACGTAGGCAAGCCAGTCTCAAACTATCCACTTATGTCTCCTCCT 

CCTCTTCCTCCACGTCCACTAGAACTCGCCATGGGAAATATTGGAGGAGAAGCTTATGGA 

AACAATCCAAACGATCTCCTTAAGTCCATC^CTGCACGAACAGAATCTGA 

ATCATCGACTTATCCGTGGCTGCAATGGAAGAGCTCATGAGGATGGTTCAAGTAGACGAG 

CCTCTGTGGAAGAGTTTGGCTTTAGACGAAGAAGAATATGCAAGGACCTTTCCrAGAG^ 

ATCGGACCTAGACCGGCTGGATATAGATCAGAAGCTTCGCGAGAAAGCGCGGTTGTGATC 

ATGAATCATGTTAACATCGTTGAGATTCTGATGGATGTGAATCAATGGTCGACGAT^ 

GCGGGGATGGTTTCTAGAGCAATGACATTAGCGGTTTTATCGACAGGAGTTGCAGGAAAC 

TATAATGGAGCTCTTCAAGTGATGAGCGCAGAGTTTCAAGTTCCATCTCGATTAGTCCCA 

ACACGTGAAACCTATTTCGCACGTTACTGTAAACAACAAGGAGATGGTTCGTGGGCGGTT 

GTCGATATTTCGTTGGATAGTCTCCAACCAAATCCCCCGGCTAGATGCAGGCGGCGAGCT 

TCAGGATGTTTGATTCAAGAATTGCCAAATGGATATTCTAAGGTGACTTGGGTGGAGCAT 

GTGGAAGTTGATGACAGAGGAGTTCATAACTTATACAAACACATGGTTAGTACTGGTCAT 

GCCTTCGGTGCTAAACGCTGGGTAGCCATTCTTGACCGCCAATGCGAGCGGTTAGCTAGT 

GTCATGGCTACAAACATTTCCTCTGGAGAAGTTGGCGTGATAACCAACCAAGAAGGGAGG 

AGGAGTATGCTGAAATTGGCAGAGCGGATGGTTATAAGCTTTTGTGCAGGAGTGAGTGCT

TCAACCGCTCACACGTGGACTACATTGTCCGGTACAGGAGCTGAAGATGTTAGAGTGATG 

ACTAGGAAGAGTGTGGATGATCCAGGAAGGTCTCCTGGTATTGTTCTTAGTGCAGCCACT 

TCTTTTTGGATCCCTGTTCCTCCAAAGCGAGTCTTTGACTTCCTCAGAGACGAGAATTCA 

AGAAATGAGTGGGATATTCTGTCTAATGGAGGAGTTGTGCAAGAAATGGCACATATTGCT 

AACGGGAGGGATACCGGAAACTGTGTTTCTCTTCTTCGGGTAAATAGTGCAAACTCTAGC 

CAGAGCAATATGCTGATCCTACAAGAGAGCTGCATTGATCCTACAGCTTCCTTTGTGATC 

TATGCTCCAGTCGATATTGTAGCTATGAACATAGTGCTTAATGGAGGTGATCCAGACTAT 

GTGGCTCTGCTTCCATCAGGTTTTGCTATTCTTCCTGATGGTAATGCCAATAGTGGAGCC 

CCTGGAGGAGATGGAGGGTCGCTCTTGACTGTTGCTTTTCAGATTCTGGTTGACTCAGTT 

CCTACGGCTAAGCTGTCTCTTGGCTCTGTTGCAACTGTCAATAATCTAATAGCTTGCACT 

GTTGAGAGAATCAAAGCTTCAATGTCTTGTGAGACTGCTTGAAAACCATCCATTAGC 

>G385 Amino Acid Sequence (domain in AA coordinates: 60-123) 

MFEPNMLLAAMt^ADSNNHNYNHEDNNN^ 

HPNKKKRYHRHTQLQIQEMEAFFKECPHPDDKQRKQL^ 

NHHERHENSHLRAENEKLRNDNLRYREALANASCPNCGGPTAIGEMSFDEHQLRLENARL 
REEIDRISAIAAKYVGKPVSNYPLMSPPPLPPRPLELAMGNIGGEAYGNNPNDLLKSITA 
PTESDKPVIIDLSVAAMEELMRMVQVDEPLWKSLALDEEEYARTFPRGIGPRPAGYRSEA
SRESAVVIMNHVNIVEILMDWQWSTIFAGMVSRAMTIA^ 

QVPSPLVPTRETYFARYCKQQGDGSWAVVDISLDSLQPNPPARCRRRASGCLIQELPNGY
SKVTWVEHVEVDDRGVHNLYKHMVSTGHAFGAKRWVAILDRQCERLASVMATNISSGEVG
VITNQEGRRSMLKLAERMVISFCAGVSASTAHTWTTLSGTGAEDVRVMTRKSVDDPGRSP
GIVLSAATSFWIPVPPKRVFDFLRDENSRNEWDILSNGGVVQEMAHIANGRDTGNCVSLL
RVNSANSSQSNMLILQESCIDPTASFVIYAPVDIVAMNIVLNGGDPDYVALLPSGFAILP
DGNANSGAPGGDGGSLLTVAFQILVDSVPTAKLSLGSVATVNNLIACTVERIKASMSCET 
A* 

>G439 (128..967)

AGGGCTTCTTCTCTTTGTTTCTCCAATCTTTATTAGTTTATTTATTTATTTTGGTTATTG 
TATACAAATGGCAATGGCTTTAAACATGAATGCTTACGTAGACGAGTTCATGGAAGCTCT 
TGAACCATTCATGAAGGTAACTTCATCTTCTTCTAC 

ATTAACTCCTAATTTCATCCCTAATAATGACCAAGTCTTACCGGTATCTAACCAAACCGG 
TCCGATTGGGCTAAACCAGCTCACTCCAACACAAATCCTCCAAATTCAGACAGAGTTACA 
TCTCCGGCAAAACCAATCTCGTCGTCGCGCTGGTAGTCATCTTCTCACCGCTAAACCAAC 
CTCAATGAAGAAAATCGACGTAGCAACTAAACCGGTTAAACTATACCGAGGCGTAAGACA 
GAGGCAATGGGGTAAATGGGTAGCTGAGATTCGGCTACCTAAAAACCGAACCCGGTTATG 
GCTCGGTACGTTCGAAACGGCTCAAGAAGCTGCATTAGCTTACGATCAAGCAGCTCATAA 
GATCAGAGGAGACAACGCTCGTCTCAATTTCCCAGACATTGTTCGTCAAGGACACTATAA 
ACAGATATTGTCTCCGTCTATCAACGCAAAGATCGAATCCATCTGCAATAGTTCTGATCT
TCCACTGCCTCAGATCGAGAAACAGAACAAAACAGAGGAGGTGCTCTCTGGTTTTTCCAA 
ACCGGAGAAAGAACCGGAATTTGGGGAGATATACGGATGCGGATACTCGGGCTCATCTCC 
TGAGTCGGATATAACGTTGTTGGATTTCTCAAGCGACTGTGTGAAAGAAGATGAGAGTTT 
CTTGATGGGTTTGCACAAGTATCCTTCTTTGGAGATTGATTGGGACGCTATAGAGAAACT 
CTTCTGAATC(^TTTTATCITTTTGATTCATTTC 

AGAGCTTTGTAAGGGAAGTTCTTGAATGAGAGTTGCAGAGGACTAGTGGAACCTAACTCT 

GTTTTCTTTTGTAAGTATTGTTTATAATGGGCCGTT^ 

GCCCAAGTTTTTAAAAAAAAAAAAAAAAAAAAAAAAA 

>G439 Amino Acid Sequence (domain in AA coordinates: 110-177) 
MAMALNMNAYVDEFMEA^ 

GLNQLTPTQILQIQTELHLRQNQSRRRAGSHLLTAKPTSMKKIDVATKPVKLYRGVRQRQ

WGKWVAEIRLPKNRTRLWLGTFETAQEAALAYDQAAHKIRGDNAIOjNFPDITO 

LSPSINAKIESICNSSDLPLPQIEKQNKTEEVLSGFSKPEKEPEFGEIYGCGYSGSSPES 

DITLLDFSSDCVKEDESFLMGLHKYPSLEIDWDAIEKLF* 

>G440 (237..1301)

AAAAAATCACTGTl^CATAACACGTTTTTCTCTCTCACCCACCAAAAAAAAATCTTTTGT 

TCTTGTTACCAAAAAATCTCGTGATAAATCTCTTCAAACTTTGTTTTATTTTCTTCTTGA 

TTCTCTCGAAATCTCTCTCAACAAACCCAGAAACTTTCCTTGATTCGCAAGCTTTTCTTC 

CTTTTATATTCTTCATTTTGATGCGAATATAGAGAGAGTCCATAAAAGAAACAGTAATGG 

ACGAATATATTGATTTCCGACCATTGAAGTACACAGAGCACAAGACTTCAATGACTAAAT 

ACACCAAAAAGTCATCGGAAAAACTTTCCGGTGGTAAGTCATTGAAAAAGGTTAGTATTT 

GTTATACTGATCCTGACGCAACAGATTCATCAAGTGACGAAGACGAAGAAGATTTCTTGT 

TTCCTCGCCGGAGAGTCAAAAGATTCGTTAACGAGATCACTGTTGAGCCTAGCTGTAACA 

ACGTCGTCACCGGAGTTTCGATGAAAGATAGAAAGAGACTCTCTTCTTCCTCCGATGAAA 

CTCAATCTCCGGCGTCGAGTCGTCAACGTCCTAATAACAAAGTTTCAGTCTCCGGTCAGA 

TAAAGAAGTTCCGTGGTGTTAGACAACGGCCATGGGGGAAATGGGCGGCGGAGATTAGAG 

ATCCGGAGCAACGTCGGAGGATTTGGCTCGGGACTTTTGAGACGGCGGAGGAAGCTGCCG 

TGGTTTATGATAACGCCGCTATAAGACTCCGTGGACCGGACGCTTTAACTAATTTCTCCA 

TACCGCCTCAAGAAGAGGAAGAAGAAGAAGAACCGGAACCGGTTATTGAGGAGAAACCGG 

TTATTATGACGACGCCAACACCAACTACATCGAGTTCTGAATCAACTGAAGAAGATTTAC

AACATCTCTCATCTCCTACTTCGGTTCTCAATCACCGGTCAGAAGAGATTCAACAAGTAC

AACAACCGTTTAAATCAGCTAAACCCGAACCGGGGGTTTCAAATGCACCATGGTGGCATA 

CCGGGTTTAATACCGGTTTAGGTGAATCAGACGATTCATTTCCTTTGGATACTCCGTTTC 

TTGACAACTATTTCAATGAATCACCACCAGAGATGTCAATATTTGACCAACCAATGGATC 

AAATTTTCTGTGAAAATGATGATATCTTCAATGATATGTTGTTCTTGGGTGGTGAAACTA

TGAACATTGAAGATGAGTTAACAAGTTCTAGTATCAAAGATATGGGTTCAACGTTTAGTG 

ATTTTGATGATTCATTGATATCAGATCTATTAGTTGCTTAATATGATGATGAGAGTGAAG 

AAGAAACCATCAAGCAAATATCTATGGTGTGACTGAAAAATTTTGGTGTTACTTTTTTTT 

CTTTCATAAGTTCATGAGCTTTTTTGTTTCTTTTTTTTAATAATTTATTTAGTTTTGTCA 

GGAGCTTGTAAAACAGTTTTGGAGAAATAGTGGAAAAATAGTTTAATTAAAAAAAAAAAA 

AAAAAAA 

>G440 Amino Acid Sequence (domain in AA coordinates: 122-189) 
MDEYIDFRPLIGTITEHKTSMTKYTKKS 

LFPRRRVKRFVOTITVEPSCNNVVTGVSMKDRKRLSSSSDETQSPASSRQR 

QIKKFRGVRQRPWGKWAAEIRDPEQRRRIWLGTFETAEEAAVVYDNAAIRLRGPDALTNF

SIPPQEEEEEEEPEPVIEEKPVIMTTPTPTTSSSESTEEDLQHLSSPTSVLNHRSEEIQQ 

VQQPFKSAKPEPGVSNAPWWHTGFNTGLGESDDSFPLDTPFLDNYFNESPPEMSIFDQPM 

DQIFCENDDIFNDMLFLGGETMNIEDELTSSSIKDMGSTFSDFDDSLISDLLVA*

>G5 (417..1421)

TTTTTTTTTTGCAATCTCCCCCTAATCTGTTGTTTCTCGCTTCTTCTTCTGTTAATCATC

TGTCTTTCAAAAAGAAAGAAAAAAGAAAAATTCGATTTCTGGGTTTGTTTTTGTCATACA 

GAAAAAAATCAAGCTTATGAATTTGTGTTTAATTTTTTGTTTTAATTTGAAAGGCAGGTT 

TTTTCAGAACGAGATCGTTTTTTC?U^TTTCTTCTGATTTTACCTCTTTTTTTCTTCTTA 

GATTTTAGTGAATCGAGGGTGAAATTTTTGATTCCCTCTTTTCGGATCTACACAGAGGTT 

GCTTATTTCAAACCTTTTAGATCCATTTTTTTTTAATTTTCTCGGAAAAATCCCTGTT^ 

TTTACTTTTTTATAAGTCTCAGGTTCAATTTTTTCGGATTCAAATTTTTATTTTAAATGG 
(^GCTGCTATGAATTTGTACACTTGTAGCAGATCGTTTCAAGACTCTGGTGGTGAACTCA

TGGACGCGCTTGTACCTTTTATCAAAAGCGTTTCCGATTCTCCTTCTTCT 

CGTCTGCGTCTGCGTTTCTTCACCCCTCTGCGTTTTCTCTCCCTCCTCTCCCCGGTTATT 

ACCCGGATTC^CGTTCTTGACCCAACCGTTTTCATACGGGTCGGATCTTCA^ 

GGTCATTAATCGGACTCAACAACCTCTCTTCTTCTCAGATCCACCAGATCCAGTCTCAGA 

TCCATCATCCTCTTCCTCCGACGCATCACAACAACAACAACTCTTTCT 

GCCCAAAGCCGTTACTGATGAAGCAATCTGGAGTCGCTGGATCITGTTTCGCTTACGGCT 

CAGGTGTTCCTTCGAAGCCGACGAAGCTTTACAGAGGTGTGAGGCAACGTCACTGGGGAA 

AATGGGTGGCTGAGATCCGTTTGCCGAGAAATCGGACTCGTCTCTGGCTTGGGACTTTTG 

ACACGGCGGAGGAAGCTGCGTTGGCCTATGATAAGGCGGCGTACAAGCTGCGCGGCGATT 

TCGCCCGGCTTAACTTCCCTAACCTACGTCATAACGGATTTCACATCGGAGGCGATTTCG 

GTGAATATAAACCTCTTCACTCCTCAGTCGACGCTAAGCTTGAAGCTATTTGTAAAAGCA 

TGGCGGAGACTCAGAAACAGGACAAATCGACGAAATCATCGAAGAAACGTGAGAAGAAGG 

TTTCGTCGCCAGATCTATCGGAGAAAGTGAAGGCGGAGGAGAATTCGGTTTCGATCGGTG

GATCTCCACCGGTGACGGAGTTTGAAGAGTCCACCGCTGGATCTTCGCCGTTGTCGGACT 

TGACGTTCGCTGACCCGGAGGAGCCGCCGCAGTGGAACGAGACGTTCTCGTTGGAGAAGT 

ATCCGTCGTACGAGATCGATTGGGATTCGATTCTAGCTTAGGGGCAAAATAGGAAATTCA 

GCCGCTTGCAATGGAGTTTTTGTGAAATTGCATGACTGGCCCAAGAGTAATTAA 

ATGGATTAGTGTTAAATTTCGTATGTTAATATTTGTATTATGGTTTGTATTAGTCTCTCT 

GTGTCGGTCCAGCTTGCGGTTTTTTGTCAGGCTCGACCATGCCACAGTTTTCATTTTATG 

TAATCTTTTTTTCTTTTGTCTTATGTAATTTGTAGCTTCAGTTTCTTCATCTATAATGCA 

ATTTTATTATGATTATGTG 

>G5 Amino Acid Sequence (domain in AA coordinates: 149-216) 

MAAAMNLYTCSRSFQDSGGELMDALVPFIKSVSDSPSSSSAASASAFLHPSAFSLPPLPG 

YYPDSTFLTQPFSYGSDLQQTGSLIGLNNLSSSQI^ 

LSPKPLLMKQSGVAGSCFAYGSGVPSKPTKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGT

FDTAEEAALAYDKAAYKLRGDFARLNFPNLRHNGFHIGGDFGEYKPLHSSVDAKLEAICK 

SMAETQKQDKSTKSSKKREKKVSSPDLSEKVKAEENSVSIGGSPPVTEFEESTAGSSPLS 

DLTFADPEEPPQWNETFSLEKYPSYEIDWDSILA* 

>G550 (1..1374) 

ATGGCTGATCCGGCGATTAAGCTCTTTGGAAAGACGATTCCTTTACCTGAGCTTGGTGTT 
GTTGATTCTTCTTCTAGCTATACCGGATTTTTAACCGAAACTCAGATTCCTGTTCGGTTA 
TCAGATTCGTGTACCGGCGATGATGATGATGAAGAGATGGGTGATTCCGGTTTAGGACGA 
GAAGAAGGTGATGATGTTGGTGATGGTGGAGGAGAGAGCGAGACTGATAAAAAGGAAGAA 
AAAGATAGTGAGTGTCAGGAAGAGTCATTGAGGAATGAATCTAATGATGTTACTACTACT 
ACATCGGGTATAACTGAAAAAACGGAAACAACAAAAGCTGCAAAGACGAATGAAGAGTCA 
GGTGGTACTGCTTGCTCTCAAGAGGGGAAGTTAAAGAAACCTGATAAGATTCTACCGTGT 
CCGCGATGTAACAGCATGGAAACCAAGTTCTGTTACTACAACAACTATAATGTTAACCAA 
CCTCGCCATTTCTGCAAGAAATGTCAGAGATATTGGACAGCTGGTGGAACGATGAGGAAT 
GTTCCGGTTGGTGCTGGGAGACGTAAGAATAAGAGTCCAGCTTCTCATTATAACCGTCAT 
GTAAGTATAACATCTGCGGAAGCTATGCAGAAGGTGGCGAGAACTGATCTTCAACATCCT 
AATGGTGCAAATCTTCTCACTTTTGGCTCTGATTCTGTGCTTTGTGAATCTATGGCTTCT 
GGATTGAATCTTGTTGAGAAGTCATTGTTGAAGACACAAACTGTATTGCAAGAACCCAAT 
GAAGGCTTGAAGATTACGGTTCCGTTAAACCAGACAAACGAAGAAGCTGGAACAGTCAGC 
CCGTTACCAAAAGTTCCATGCTTTCCAGGACCACCACCAACTTGGCCTTACGCTTGGAAC 
GGAGTTTCGTGGACGATTTTACCGTTTTACCCTCCACCGGCTTACTGGAGCTGCCCGGGG 
GTTTCACCGGGGGCATGGAACAGCTTCACATGGATGCCACAACCCAATTCACCATCTGGT 
TCCAATCCAAATTCTCCTACACTAGGTAAACATTCACGTGACGAGAACGCTGCTGAACCA 
GGAACCGCTTTTGATGAAACCGAGTCACTTGGTAGGGAGAAAAGCAAACCCGAGAGATGC 
TTGTGGGTTCCCAAGACGCTGAGGATTGATGATCCAGAGGAAGCTGCTAAAAGTTCCATC 
TGGGAAACATTAGGGATCAAAAAAGACGAAAATGCGGATACTTTCGGAGCTTTCAGATCA
TCAACCAAAGAAAAAAGCAGTCTTTCTGAAGGAAGACTTCCGGGAAGAAGACCGGAGTTG 
CAAGCGAATCCTGCTGCTCTTTCTAGGTGAGCAAACTTCCATGAGAGCTCATAG
>G550 Amino Acid Sequence (domain in AA coordinates: 134-180) 
MADPAIKLFGKTIPLPELGVVDSSSSYTGFLTETQIPVRLSDSCTGDDDDEEMGDSGLGR 
EEGDDVGDGGGESETDKKEEKDSECQEESLRNESNDVTTTTSGITEKTETTKAAKTNEES
GGTACSQEGKLKKPDKILPCPRCNSMETKFCYYNNYNVNQPRHFCKKCQRYWTAGGTMRN
VPVGAGRRKNKSPASHYNRHVSITSAEAMQKVARTDLQHPNGANLLTFGSDSVLCESMAS
GLNLVEKSLLCTQTVLQEPNEGLKITVPL 

GVSWTILPFYPPPAYWSCPGVSPGAWNSFTWMPQPNSPSGSNPNSPTLGKHSRDENAAEP 
GTAFDETESLGREKSKPERCLWVPKTLRIDDPEEAAKSSIWETLGIKKDENADTFGAFRS
STKEKSSLSEGRLPGRRPELQANPAALSRSANFHESS*
>G670 (28..1152)

CACAGCATTGCAGCTGTGAATAACTAAATGGGGAGACATTCTTGCTGTTACAAACAAAAG 
CTGAGGAAAGGGCTTTGGTCTCCTGAAGAAGACGAGAAGCTTCTTACTCACATCACCAAT 
CACGGCCATGGCTGCTGGAGCTCT^TCCCTAAACTCGCTGGTTTGCAGAGATGTGGGAAG 
AGTTGTCGACTCGAGCAGATCTGGTACCGCCGACTAAGATGGATCAATTACTTGAGACCT 
GATTTAAAGAGAGGAGCTTTTTCTCCTGAAGAAGAGAA^^^ 

GTCCTTGGAAACAGATGGTCACAGATTGCGTCAAGGCTTCCGGGTAGAACCGACAACGAG 

ATCAAGAATCTATGGAACTCAAGCATCAAGAAGAAACTGAAACAAAGAGGCATTGACCCA 

AACACACACAAGCCCATCTCTGAAGTGGAGAGTTTTAGCGACAAAGACAAACCAACAACA

AGCAACAACAAAAGAAGCGGTAACGATCACAAGTCTCCTAGTTCCTCTTCTGCGACTAAC 

CAAGACTTCTTCCTCGAAAGGCCATCTGATTTATCCGACTACTTCGGATTTCAGAAGCTT

AACTTCAACTCCAATCTAGGACTCTCTGTTACAACTGATTCTTCACTCTGCTCGATGATT 

CCGCCGCAGTTTAGCCCCGGGAACATGGTTGGTTCTGTCCTTCAGACACCAGTATGCGTA 

AAGCCCTCGATTAGTCTTCCTCCCGACAACAACAGTTCGAGTCCTATCTCCGGAGGAGAT 

CATGTGAAATTGGCTGCACCAAACTGGGAATTTCAGACAAACAACAATAATACCTCAAAT 

TTCTTCGACAATGGCGGATTCTCATGGTCTATCCCAAATTCTTCTACTTCTTCTTCACAA 

GTCAAACCAAATCATAACTTCGAAGAAATAAAATGGTCAGAGTATTTGAACACACCGTTC 

TTCATAGGGAGTACTGTACAGAGTCAAACCTCTCAACCAATCTACATCAAATCAGAAACA 

GATTACTTAGCCAATGTTTCAAACATGACAGATCCTTGGAGCCAAAACGAGAACTTGGGC 

ACAACTGAAACTAGTGACGTGTTCTCCAAGGATCTTCAGAGAATGGCCGTCTCTTTTGGT 

CAGTCCCTTTAGCTTTTTTCTTTCTTTCTTTCTTATTTCTAACAGATGTAGAGAACATAA 

AGATATACAAATACATACAATGTCAATACGTACAGTGGATTTAAGTGTTCTGTATATTTC 

ATGGGCGAGCTGTCTTTATTTTTATGTTTAAAAAAAAAAAAAAAAAA 

>G670 Amino Acid Sequence (domain in AA coordinates: 14-122) 

MGRHSCCYKQKLRKGLWSPEEDEKLLTHITimGHGOT^ 

RRLRWINYLRPDLKRGAFSPEEENLIVELHAVLGNRWSQIASRLPGRTDNEIKNLWNSSI 
KKKLKQRGIDPNTHKPISEVESFSDKDKPTTSNNKRSGNDHKSPSSSSATNQDFFLERPS 
DLSDYFGFQKLNFNSNLGLSVTTDSSLCSMI^^ 
NNSSSPISGGDHVKLAAPNWEFQTNNNNTSNFFDNGGFSW 

IKWSEYLNTPFFIGSTVQSQTSQPIYIKSETDYLANVSNMTDPWSQNENLGTTETSDVFS

KDLQRMAVSFGQSL* 

>G760 (175..1878)

TGCTTAATTCCAATGCCATCGTGATCGATTCATCTCTCTCTCTCTCTTCCAATTTTCCCA 
ATTCTTTTTTAAAACCCTAATTTTTCAGATATCTGATTATCTCTTGTATTTCTTCTACTC 
GATTTGCTCCCATAAAAACCCTTACTTTCTTCAAGTTCTGGTTTTCACCGATTGATGGGT 
CGTGGCTCAGTGACGTCGCTTGCTCCTGGGTTCCGTTTTCACCCGACGGATGAGGAACTT 
GTTCGCTACTACCTTAAGCGTAAGGTCTGCAACAAACCCTTTAAGTTCGATGCTATTTCC 
GTCACCGACATATACAAGTCTGAGCCTTGGGATCTACCAGATAAGTCGAAGCTGAAAAGT 
AGAGACTTGGAATGGTACTTCTTTAGTATGCTGGATAAGAAGTACAGTAATGGTTCCAAG 
ACGAATCGTGCTACGGAGAAAGGGTATTGGAAGACGACTGGGAAAGATCGGGAGATTCGT 
AATGGTTCAAGAGTCGTTGGGATGAAGAAGACACTTGTTTATCACAAGGGTCGAGCTCCT 
CGTGGTGAAAGGACCAATTGGGTTATGCATGAGTATCGGCTTTCTGATGAGGACTTGAAG 
AAAGCTGGTGTGCGACAAGAAGCATATGTGTTATGTAGGATATTCCAGAAAAGTGGTACG 
GGTCCTAAGAATGGGGAGCAGTATGGTGCTCCTTATCTTGAGGAGGAGTGGGAAGAAGAT 
GGAATGACTTATGTACCTGCTCAAGATGCTTTCAGTGAAGGATTGGCTTTGAATGATGAT
GTTTATGTCGATATTGATGACATTGACGAGAAGCCCGAAAATCTGGTGGTCTATGATGCC 
GTTCCTATTCTACCTAACTATTGTCATGGGGAATCAAGTAACAATGTTGAATCAGGCAAT 
TACTCAGACTCTGGAAATTACATTCAACCAGGAAACAATGTTGTCGACTCTGGTGGGTAC 
TTTGAACAACCAATTGAAACTTTTGAGGAAGATCGGAAGCCTATTATACGGGAGGGTAGC 
ATTCAGCCTTGTTCTCTGTTTCCAGAGGAACAAATTGGCTGTGGTGTGCAAGACGAAAAT 
GTGGTGAATCTGGAATCTTCCAACAATAATGTGTTTGTAGCTGATACATGCTACAGTGAC 
ATTCCTATTGATCATAACTATTTACCGGATGAGCCATTCATGGATCCTAATAACAATCTT 
CCACTCAACGATGGTCTGTACCTGGAAACGAATGATCTCAGCTGTGCTCAACAAGATGAT
TTTAACTTCGAAGATTATCTCAGCTTCTTO^ 

CTATTAATGGGACCTGAAGATTTTCTTCCCAACCAAGAAGCCCTTGACCAGAAACCTGCC 
CCTAAAGAATTGGAGAAGGAGGTCGCAGGAGGCAAAGAGGCAGTGGAGGAAAAGGAAAGT 
GGCGAAGGATCTTCTTCAAAACAAGATACAGATTTCAAGGACTCT

TACCCATTTCTCAAAAAGACGAGCCACATGCTTGGAGCCATTCCTACTCCATCTTCATTT 

GCTTCAC^GTTCCAAACAAAGGACGC^U^TGCGTCTACACGCAGCACAATCTTCTGGTTCA 

GTTCACGTGACTGCAGGTATGATGAGAATATCAAACATGACTCTAGCAGCGGACAGCGGT 

ATGGGCnXSGTCATATGACAAGAACGGTAACCTCAACGTAGTCCTTTCATTCGGGGTAGTC 

CAACAGGATGATGCGATGACTGCCTCGGGAAGCAAGACAGGAATTACGGCGACAAGAGCT 

ATGTTAGTCTTCATGTGTTTATGGGTTCTCCTACTCTCTGTTAGCTTCAAAATAGTAACC 

ATGGTGTCTGCTCGGTAATAGGATCAAAGTTGAATCGTCTCAAAGACTTTTTTTGGTGTT 

TGTACCTCTCCAATCATATAGCCTTTAACTTTGGCA 

TTTTAAAAAAAAAAAAAAAAA 

>G760 Amino Acid Sequence (domain in AA coordinates: 12-156) 

MGRGSVTSLAPGFRFHPTDEELTOYYLKRKVCNK^ 

KSRDLEWYFFSMLDKKySNGSKTimATEKGYWKTTGKDREIR^ 

APRGERTNWVMHEYRLSDEDLKKAGVPQEAYVLCRIFQKSGTGPKNGEQYGAPYLEEEWE 
EDGMTYVPAQDAFSEGLALNDDVYVDIDDIDEKPENLVVYDAVPILPNYCHGESSNNVES
GNYSDSGNYIQPGNNVVDSGGYFEQPIETFEEDRKPIIREGSIQPCSLFPEEQIGCGVQD
ENVVNLESSNNNVFVMTCTSDIPIDH^ 

DDFNFEDYLSFFDDEGLTFDDSLLMGPEDFLPNQEALDQKPAPKELEKEVAGGKEAVEEK 

ESGEGSSSKQDTDFKDFDSAPKYPFLKKTSHMLGAIPTPSSFASQFQTKDAMRLHAAQSS

GSVHVTAGMI^ISNMTLAADSGMGWSYD 

RAMLVFMCLWVLLLSVSFKIVTMVSAR* 

>G831 (92..1987)

TTCTTTCATCGTTGTGTCTATTATAAATATATGTCAATTTGGTTTCTAAAAAATTCTACC 
ATTGATTGATTGATTTTTTTTTCTTTAAGAGATGAATTTATTTACAAGAATCTCATCTCG 
GACTAAGAAGGCCAATCTTTACTACGTAACCCTAGTTGCTCTTCTCTGCATCGCTAGCTA 
CCTTCTCGGTATTTGGCAAAACACGGCGGTTAATCCACGCGCCGCCTTCGATGATTCAGA 
CGGTACACCGTGCGAGGGATTCACCAGACCTAATTCTACGAAAGATCTCGACTTCGACGC 
GCATCACAACATTCAAGATCCACCTCCGGTGACGGAAACCGCCGTTAGTTTCCCGTCGTG 
TGCCGCCGCGTTGAGCGAGCACACGCCATGCGAAGACGCGAAGCGATCGTTGAAATTCTC 
GAGGGAGAGATTGGAGTATAGGCAAAGGCATTGTCCCGAGAGAGAAGAAATCTTGAAGTG 
CAGAATTCCGGCGCCGTACGGTTACAAAACGCCGTTCCGATGGCCGGCGAGTCGTGACGT 
GGCGTGGTTCGCTAATGTGCCTCACACGGAGCTTACGGTTGAGAAAAAGAATCAGAATTG 
GGTCCGGTACGAGAATGATCGGTTTTGGTTCCCTGGTGGAGGTACGATGTTTCCACGTGG 
CGCTGATGCTTACATTGATGATATCGGACGGTTGATTGATCTCAGCGACGGCTCTATCCG 
TACAGCCATCGATACCGGTTGCGGGGTGGCTAGCTTCGGTGCATATCTTTTATCAAGAAA 
CATTACAACGATGTCATTTGCACCAAGAGACACACACGAAGCTCAAGTCCAGTTCGCACT 
CGAGCGTGGTGTGCCGGCGATGATCGGAATCATGGCTACAATCCGCCTACCGTACCCTTC
TAGAGCCTTTGATTTAGCACATTGCTCTCGTTGCCTTATTCCGTGGGGCCAAAACGATGG 
GGCTTACTTGATGGAGGTGGATAGGGTTTTAAGACCAGGAGGGTACTGGATACTTTCTGG 
ACCGCCGATTAATTGGCAGAAACGGTGGAAAGGGTGGGAACGGACCATGGATGATTTGAA 
TGCAGAGCAGACTCAGATCGAGCAGGTCGCGAGAAGCTTGTGTTGGAAGAAAGTTGTTCA
AAGAGATGATCTTGCTATTTGGCAAAAACCCTTTAACCACATTGACTGTAAGAAAACCAG 
AGAGGTTTTGAAAAATCCGGAGTTTTGTCGTCATGATCAAGATCCCGACATGGCCTGGTA 
TACGAAGATGGATTGTTGTTTGACACCATTACCTGAAGTTGATGACGCTGAGGATCTAAA 
GACGGTGGCCGGAGGGAAGGTAGAAAAGTGGCCGGCTAGATTAAACGCGATTCCTCCGAG 
AGTAAACAAAGGCGCTCTCGAGGAAATCACACCTC 

GTGGAAACAGAGAGTTTCTTATTACAAGAAGTTAGATTACCAGTTGGGTGAAACCGGGAG 
ATACAGAAACTTAGTCGACATGAACGCTTACCTCGGTGGATTCGCGGCGGCTCTAGCGGA 
TGATCCGGTCTGGGTCATGAACGTTGTCCCGGTCGAGGCTAAGCTCAATACGCTCGGTGT 
CATCTACGAGCGTGGTCTAATCGGAACGTATCAAAACTGGTGTGAAGCCATGTCGACGTA 
TC(^GAACGTATGATTTTATCCATGCTGACTCGGTTTTCAC^TTGTACC^GGTC^TG 
TGAACCGGAGGAGATATTGTTGGAGATGGACCGAATTCTTAGACCGGGTGGTGGTGTGAT 
TATAAGAGATGACGTGGACGTTTTGATCAAGGTTAAGGAATTAACCAAAGGATTAGAATG 
GGAAGGTAGAATTGCTGACCACGAGAAGGGTCCTCATGAAAGAGAGAAGATTTACTATGC
GGTGAAACAGTATTGGACCGTTCCTGCGCCTGATGAAGATAAAAACAACACTAGTGCTCT 
CTCCTGATTTTTGAGTTTTTTTTTTCTTACAATGTTTTT^ 
TATACAACAATAAATTCTCAATAATTGTTGTCGCGGCCG 

>G831 Amino Acid Sequence (domain in AA coordinates: 470-591) 
MNLFTRISSRTKKANLYYVTLVALL^^ 

NSTKDLDFDAHHNIQDPPPVTETAVSFPSCAAALSEHTPCEDAKRSLKFSRERLEYRQRH

CPEREEILKCRIPAPYGYKTPFRWPASRDVAWFANVPHTELTVEKKNQNWVRYENDRFWF

PGGGTMFPRGADAYIDDIGRLIDLSDGSIRTAIDTGCGVASFGAYLLSRNITTMSFAPRD 

THEAQVQFALERGVPAMIGIMATIRLPYPSRAFDLAHCSRCLIPWGQNDGAYLMEVDRVL 

RPGGYWILSGPPINWQKRWKGWERTMDDLNAEQTQIEQVARSLCWKKVVQRDDLAIWQKP 

F^IDCKKTREVLKNPEFCRHDQDPDMAWYTKMDSCLTPLPETODAEDLKTV 

PARLNAIPPRVNKGALEEITPEAFLENTKLWKQRVSYYKKLDYQLGETGRYRNLVDMNAY

LGGFAAALADDPVWVMNVVPVEAKLNTLGVIYERGLIGTYQNWCEAMSTYPRTYDFIHAD

SVFTLYQGQCEPEEILLEMDRILRPGGGVIIRDDVDVLIKVKELTKGLEWEGRIADHEKG

PHEREKIYYAVKQYWTVPAPDEDKNNTSALS*

>G864 (503..1534)

TGCAAAAACATTTTCTTGTCTCTCCTCTG 

CTAGAAAAACCC^GCAAAGCTTTAACCCCTTCCTCCTCCAAAAGTAGCATCTTCCTCTT 
TTTCTATTTCTCCTTTCCTCTTCTTATCTCTCTCTCGTTTGTGAACGATTCCTTAAGAAT 
ATAACCAAAAGCCCTTTTCTCCTTTCTTCAACTTTCCGGGAAAAATCTTCACGCAGCAAG 
GTTTCTCTCTCGGCTCTCGCAGTGTTTTTCGGGCCTTTTGTTCTTTCTATAAAAAAAAAA 
TTCGCGTCCTTTAAGAAAACTTTTTCCACCTAGAGAAGAAGAAGAGTATCACTCTTGTTG 
TTCAAGTTTCTCTCTTTAATAAAAAATCCATCTTTATTCTTTGTCTTCTTTCCTTTTTGC 
TTTCCCTAATCTCTATGTTATAAACACACAGAGAGAAACAAAGTCACAGTCTCGAGTCIAA 
AAACAGAGAATACGAAAGAAAAATGGAAGCGGAGAAGAAAATGGTTCTACCGAGAATCAA 
ATTCACAGAGCACAAAACCAACACGACAACAATCGTATCGGAGT^ 

AACCAGGATTCTTCGTATCTCAGTCACTGACCCAGACGCTACTGATTCCTCCAGTGACGA 
CGAAGAAGAAGAACATCAACGCTTTGTCTCTAAACGCCGTCGTGTTAAGAAGTTTGTC^ 
CGAAGTCTATCTCGATTCCGGTGCTGTTGTTACTGGTAGTTGTGGTCAAATGGAGTCGAA 
GAAGAGACAAAAGAGAGCGGTTAAATCGGAGTCTACTGTTTCTCCGGTTGTTTCAGCGAC 
GACGACTACGACGGGAGAGAAGAAGTTCCGAGGAGTGAGACAGCGTCCATGGGGAAAATG 
GGCGGCGGAGATAAGAGATCCGTTGAAACGTGTACGGCTCTGGTTAGGTACTTACAACAC 
GGCGGAAGAAGCTGCTATGGTTTACGATAACGCCGCTATTCAGCTTCGTGGTCCCGACGC 
TCTGACTAATTTCTCAGTCACTCCGACAACAGCGACGGAGAAGAAAGCCCCACCACCGTC 
TCCGGTGAAGAAGAAGAAGAAGAAAAACAACAAAAGCAAAAAATCCGTTACTGCTTCTTC 
CTCCATCAGCAGAAGCAGCAGCAACGATTGTCTCTGCTCTCCGGTGTCTGTTCTCCGATC 
TCCTTTCGCCGTCGACGAATTCTCCGGCATTTCTTCATCACCAGTCGCGGCCGTTGTAGT 
CAAGGAAGAGCCATCCATGACAACGGTATCTGAAACTTTCTCTGATTTCTCGGCGCCCTT 
GTTCTCAGATGATGACGTGTTCGATTTCCGGAGCTCAGTGGTTCCCGACTATCTCGGCGG 
CGATTTATTTGGGGAAGATCTATTCACGGCGGATATGTGTACGGATATGAACTTCGGATT 
CGATTTCGGATCCGGATTATCCAGCTGGCACATGGAGGACCATTTTCAAGATATCGGGGA 
TCTATTCGGGTCGGATCCTCTTTTAGCTGTTTAATAATATTTTAAATAAATAAATAGTTA 
TACCGGCCGTTACTAAACGGAACCGGAGAAAGTTTTGTATACCGGTGACATAAAATCTCG 
GTTATGTTCGTAATCTTTTTTTCTTTGTTATATATAAAAATATGAATGAAACTGAATTAA 
TGTAAGTTAATGGTGATAATTATTAACGTTTTAAGTTTTGAAAAAAAAAAAAAAAAAAAA 
AAAAAAA 

>G864 Amino Acid Sequence (domain in AA coordinates: 119-186) 
MEAEKKMVLPRIKFTEHKTNTTTIVSELTNTHQTRILRISVTDPDATDSSSDDEEEEHQR
FVSKRRRVKKFVNEVYLDSGAVVTGSCGQMESKKRQKRAVKSESTVSPVVSATTTTTGEK

KFRGVRQRPWGKWAAEIRDPLKRVRLWLGTYNTAEEAAMVYDNAAIQLRGPDALTNFSVT
PTTATEKKAPPPSPVKKKKKKNNKSKKSVTASSSISRSSSNDCLCSPVSVLRSPFAVDEF
SGISSSPVAAVVVKEEPSMTTVSETFSDFSAPLFSDDDVFDFRSSVVPDYLGGDLFGEDL
FTADMCTDMNFGFDFGSGLSSWHMEDHFQDIGDLFGSDPLLAV* 
>G884 (31..1575)

TTTTTTTTTGTTTGTTAATTTTGGGGATCGATGTCGGAAAAGGAAGAAGCTCCGTCGACA 
TCGAAGTCCACCGGAGCTCCGTCGCGTCCGACTTTATCTCTTCCTCCACGGCCGTTTAGT 
GAGATGTTCTTTAACGGTGGCGTTGGATTCAGTCCTGGTCCGATGACTCTGGTCTCTAAT

ATGTTCCCTGATTCCGATGAGTTTAGGTCTTTCTCTCAGCTTCTCGCTGGAGCCATGTCT 

TCTCCAGCGACTGCAGCTGCTGCTGCTGCTGCTGCGACGGCTAGTGATTACCAGAGACTT 

GGTGAAGGGACTAATAGCTCTAGTGGTGATGTTGACCCGAGATTCAAGCAAAACAGACCA 

ACCGGTTTGATGATTTCTCAATCTCAATCGCCGTCGATGTTCACCGTACCGCCTGGTTTA 

AGTCCAGCTATGTTGCTCGATTCACCAAGCTTTTOGGGTCTTTTCTCTCCCGTTCAGGGA 

TGATATGGAATGACACATCAGCAAGCTCTAGCT 

AATGCCAATATGCAACCAGAAACAGAGTACCCTCCT 

TCGGGTCAAGCGCAGATCCCGACCTCGGCTCCACTACCAGCTCAAAGAGAAACCTCAGAT 
GTAACCATCATAGAGGACAGGTCACAACAGCCT 

GGCTATAACTGGCGAAAATATGGGCAAAAGCAAGTTAAAGGTAGCGAGTTTCCACGAAGC 
TATTACAAGTGTACTAATCCAGGATGTCCTGTCAAGAAGAAGGTTGAGAGATCTCTTGAT 
GGACAAGTAACGGAGATTATCTACAAAGGTC^GCACAATCATGAACCTCCTCAAAACACT 
AAGCGAGGTAACAAAGATAACACCGCGAATATAAATGGGAGTTCGATAAATAACAATCGC 
GGGAGTTCTGAATTGGGGGCATCACAGTTTCAAACTAATAGCTCCAACAAGACTAAGAGA 
GAGC^UVCATGAAGCAGTAAGTCAAGCTACGACAACAGAGCACTTGTCTGAGGCAAGTGAC 
GGTGAAGAAGTTGGTAATGGAGAAACTGATGTGAGAGAGAAAGATGAGAATGAGCCTGAT 
CCCAAGAGAAGAAGTAC^GAAGTTCGGATTTCAGAACCAGCTCCTGCTGCTTCACATAGA 
ACTGTGACAGAGCCTAGAATTATTGTCCAAACGACGAGTGAAGTTGATCTTCTAGATGAT 
GGATATAGGTGGCGTAAATATGGACAGAAAGTTGTCAAAGGGAATCCTTATCCGAGGAGC 
TACTACAAGTGCAGAACACCAGGATGTGGTGTGAGGAAAC^ 

GATCCAAAAGCTGTAGTAACAACATATGAAGGAAAACATAACCATGACCTTCCCGCTGCT 
AAATCAAGCAGCCATGCCGCTGCAGCGGCACAGTTAAGGCCAGATAATCGACCTGGCGGT 
TTGGCTAACTTAAATCAACAGCAGCAGCAACAGCCCGTTGCGCGGCTAAGGCTTAAAGAA 
GAGCAAACAACTTGAGAGAAGAAAACTCTTGACCGTTTTTCATTACAAAAGCTTTCAAAT 
TCCACTCACA(^CTTGTCTGAAAAATCTAGCAGTTTG(^GGAAAGAAACAGCTTCAAGAG 
GTTGTAGTTCTTCTATGTTCTGGTGTAAAACTTAAAAGCTTTTTAGGGTTTTCAGATTTC 
TGTTTACTAATACTGTATGTGAATTCTTTTGTACATGAGGAAGT^AAATTACAGGGGGATA 
TTTTGTGTTGTATCTTTTGTGTTATTGTTTCAGTAAAAGATAGGTCTTACATTTTGTGTA 
AAAAAAAAAAAAAAAAAAA 

>G884 Amino Acid Sequence (conserved domain in AA coordinates: 227-285, 407-465)

MSEKEEAPSTSKSTGAPSRPTLSLPPRPFSEMFFNGGVGFSPGPMTLVSNMFPDSDEFRS 

FS QLLAGAMS S PATAAAAAAAATASDYQRLGEGTNS S SGDVD PRFKQNRPTGLMI S QSQS 

PSMFTVPPGLS PAMLLDS PS FLGLF SPVQG S YGMTHQQAL AQVTAQAVQANANMQP QTEY 

PPPSQVQSFSSGQAQIPTSAPLPAQRETSDVTIIEHRSQQPLNVDKPADDGYNWRKYGQK 

QVTCGSEFPRS YYKCTNPGCPVKKKVERSLDGQVTE 1 1 YKGQHNHEPPQNTKRGNKDNTAN 

INGSSINlSn^GSSELGASQFQTNSSNKTKREQH^^ 

VREKDENEPDPKRRSTEVRI SEPAPAASHRTVTEPRI I VQTTSEVDLLDDGYRWRKYGQK 
VV1CGNPYPRS YYKCTTPGCGVRKHVERAATDPKAWTTYEGKHNm 
QLRPDNRPGGLANIiNQQQQQQPVARLRLKEEQTT* 
>G898 (161..772)

GAAAAAAAGATTCAAAAACCCTAGATTTCACAAAATCGATTGGCTGTCAAATTTCTCTCC 
GGCGATTTTCCTCGAGTGAAATTCGGCTCAAGGTGATTATAGCGATCATCGAATCAAATT 
GATTGAAGAGGTACAAAGGTTAGTTACTTTGAGCTGAAAGATGAACACGTCAGAGGTGAG 
AGTACCTCGAGGAAATCGACGGAGGAAAGCTGTGATTGATCTGAATGCGGTACCTGTTGA 
TCAAGAAGGGACCTCTGCTTCTGTTAGAACTCTTACGGTGCCTATTACACCGTCTCAGCC 
TGCTCCTACGATGATTGATGTCGATGCTATTGAGGATGATGTTATTGAATCATCCGCTAG 
TGCTTTTGCTGAAGCTAAAAGCAAATCAAGAAATGCACGTCGGAGACCTTTGATGGTTGA 
TGTAGAGTCAGGAGGTACGACTAGATTCCCTGCCAACATAAGC7UVCAAACGCAGAAGGAT 
TCCTTCTAGTGAATCTGTCATCGACTGTGAGCATGCCTCTGTAAATGATGAAGTCAACAT 
GTCTTCGAGAGTGTCTAGATCAAAGGCTCCAGCTCCTCCACCAGAAGAGCCAAAGTTTAC 
ATGTCCAATCTG(^TGTGTCCCTTTACGGAGGAGATGTCAACCAAGTGCGGTCIACATCTT 
CTGCAAGGGATGTATAAAGATGGCAATATCTCGCCAGGGCAAATGCCCTACTTGTAGGAA 
AAAGGTTACTGCAAAAGAGCTGATTCGAGTTTTCCTTCCAACCACTAGATGAGTGGTCCG 
GCAACATCACCAGCCACCCTGTCTAATGGTTTA^ 

ACATTGAAGGGACTTCGTTGACTTGGTATTTTTGAATATTTTGCTTTGTTGGAAGAGAAA 
TATTCAGTGATCAAGAAGCCAGAAGGCCCTATCATTCGATGGATATCATTGGTAATAACT 
CTTTGTTTTTAGTTGTTGTTCTATGTAATTTA

CTTCTCTCTTGATAGATGATAAGATATATGGAAAAAAAAATTAATATTGAATCTTTACT 
AAA 

>G898 Amino Acid Sequence (domain in AA coordinates: 148-185) 

MNTSEVRVPRGNRRRKAVIDLNAVPVDQEGTSASVRTLTVPITPSQPAPTMIDVDAIEDD 

VIESSASAFAEAKSKSRNARRRPLMVDVESGGTTRFPANISNKRRRIPSSESVIDCEHAS 

VNDEVNMSSRVSRSKAPAPPPEEPKFTCPICMCPFTEEMSTKCGHIFCKGCIKMAISRQG 

KCPTCRKKVTAKELIRVFLPTTR* 

>G900 (1..648) 

ATGGGGAAGAAGAAGTGCGAGTTATGTTGTGGTGTAGCGAGAATGTATTGTGAGTCAGAT 
CAAGCGAGTTTATGTTGGGATTGTGACGGTAAAGTTCACGGAGCTAATTTTCTGGTGGCG 
AAACACATGCGTTGTCTTCTATGTAGCGCGTGTCAGTCACACACGCCTTGGAAAGCTTCT 
GGGCTGAATCTTGGCCCAACTGTTTCTATCTGTGAGTCTTGTTTAGCTCGTAAGAAGAAT 
AACAACAGCTCCCTCGCCGGGAGGGATCAGAATCTTAACCAAGAAGAAGAGATCATTGGT 
TGTAACGACGGAGCTGAGTCTTATGATGAGGAAAGCGATGAGGATGAAGAAGAAGAAGAA 
GTGGAGAATCAGGTTGTTCCGGCTGCGGTGGAGCAAGAACTTCCGGTGGTGAGTTCGTCG 
TCTTCGGTTAGTAGTGGTGAAGGAGATCAGGTGGTGAAAAGGACGAGACTTGATTTGGAT 
CTTAACCTCTCCGATGAGGAGAACCAATCTAGACCATTGAAAAGATTATCGAGAGACGAA 
GGTTTGTCAAGATCAACTGTTGTGATGAATAGCTCAATCGTGAAATTACACGGAGGGAGG 
AGAAAAGGAGAGGGATGTGATACATCATCGTCGTCTTCGTTTTATTGA 

>G900 Amino Acid Sequence (domain in AA coordinates: 6-28, 48-74) 

MGKKKCELCCGVARMYCESDQASLCWDCDGKVHGANFLVAKHMRCLLCSACQSHTPWKAS 

GLNLGPTVSICESCLARKKNNNSSLAGRDQNLNQEEEIIGCNDGAESYDEESDEDEEEEE

VENQVVPAAVEQELPVVSSSSSVSSGEGDQVVKRTRLDLDLNLSDEENQSRPLKRLSRDE

GLSRSTVVMNSSIVKLHGGRRKAEGCDTSSSSSFY*

>G913 (108..806)

CATTCAAAAACATCATATATATACACAAACACACTTTGATACAAC^ 
ACAAACAAAAACACATTGTAACATTAGTTTAAGC^ 

ATAATTCTCCGACCACCGTGAATCAAGAAACGACGACGTCTCGTGAAGTCTCAATCACAT 

TGCCTACTGATCAATCTCCTC^^CCTCACCAGGATCATCTTCTTCTCCTTCACCGAGAC 

CTTCCGGTGGATCACCGGCGAGAAGAACGGCGACTGGATTATCCGGCAAGCACTCTATTT 

TCAGGGGGATTCGACTACGTAACGGAAAATGGGTATCGGAGATTAGAGAGCCACGTAAAA 

CGACAAGAATTTGGCTCGGGACTTATCCGGTACCGGAGATGGCTGCCGCCGCTTACGACG 

TGGCTGCGTTAGCTTTAAAAGGACCCGACGCCGTTTTGAATTTTCCTGGTTTAGCTTTGA 

CTTACGTGGCTCCGGTTTCAAACTCTGCTGCGGATATAAGAGCGGCTGCTAGTAGAGCAG 

CGGAGATGAAGCAACCGGATCAGGGTGGGGATGAGAAGGTATTGGAACCGGTTCAACCCG 

GCAAAGAGGAAGAATTAGAAGAAGTGTCGTGTAACTCGTGTTCGTTGGAGTTTATGGATG 

AGGAAGCGATGTTGAATATGCCGACTTTGTTGACGGAGATGGCTGAAGGGATGTTGATGA 

GTCCACCGAGAATGATGATACATCCGACGATGGAAGATGATTCGCCGGAGAATCATGAAG 

GAGATAATCTTTGGAGTTATAAATGAATCCATTGAAGCTGCTCTCTTTTTTATTGTTTTC 

CGGTCGAATGAGATTTTCCCCCTTTTTTTTTTTCTTTTTGGGTCGCTGTT 

>G913 Amino Acid Sequence (domain in AA coordinates: 62-128) 

MSNNNNSPTTVNQETTTSREVSITLPTDQSPQTSPGSSSSPSPRPSGGSPARRTATGLSG 

KHSIFRGIRLRNGKWVSEIREPRKTTRIWLGTYPVPEMAAAAYDVAALALKGPDAVLNFP

GLALTYVAPVSNSAADIRAAASRAAEMKQPDQGGDEKVLEPVQPGKEEELEEVSCNSCSL 

EFMDEEAMLNMPTLLTEMAEGMLMSPPRMMIHPTMEDDSPENHEGDNLWSYK*

>G937 (45..1046)

TGGAAAAAGTTTGA^TTTTTAATTCGAATCGAGAAAAAATAAAAATGGGTTCTTTAGGTG 
ATGAGCTTAGTTTGGGATCGATCTTTGGGAGAGGAGTTTCGATGAATGTTGTGGCGGTTG 
AGAAAGTTGATGAACATGTTAAGAAGCTTGAAGAAGAGAAGAGAAAGCTCGAAAGTTGTC 
AACTTGAGCTTCCTCTGTCTTTGCAGATTTTAAACGATGCGATTTTGTATCTGAAGGATA 
AGAGATGTTCAGAGATGGAGACTCAACCATTGTTGAAAGATTTCATTTCTGTTAATAAAC 
CTATTCAAGGAGAAAGAGGAATAGAATTGCTGAAAAGAGAGGAGCTAATGAGGGAGAAGA 
AGTTTCAGCAATGGAAAGCTAATGATGATCACACTAGTAAGATCAAGAGCAAGCTTGAGA 
TTAAGAGAAATGAGGAGAAATCTCCTATGTTGTTGATTCCAAAGGTGGAAACTGGTTTAG 
GCCTCGGTTTAAGTTCGAGTTCGATAAGAAGAAAAGGGATTGTTGCCT(^TGTGGCTTTA 
CTTCTAACTCTATGCCACAACCACCAACACCAGCAGTACCACAACAACCAGCATTTCTTA 
AGCAGO^GCTTTACGGAAGCAAAGAAGGTGTTGGAATCCAGAGTTGCATCGCCGATTT^
TCGATGCATTGCAACAGCTAGGTGGACCGGGAGTGGCAACTCCTAAACAAATTAGAGAAC 
ATATGCAAGAAGAAGGCTTAACCAATGATGAAGTCAAGAGTCATTTACAGAAATACAGGT 
TACACATCAGGAAGCCAAATTCGAATGCGGAGAAACAATCAGCAGTTGTTTTAGGG1TTA 
ACTTGTG GAATTCTTCAGCACAAGATGAAGAAGAGACATGTGAAG GAGGAGAATCATTGA 
AGAGAAGCAATGCGCAATCAGATTCTCCTCAAGGTCCTTTGCAGTTACCGTCTACAACAA 
CAACAACTGGTGGAGATAGTAGCATGGAAGATGTTGAAGATGCTAAGTCTGAGAGCTTTC 
AACTGGAGAGATTGAGATCACCATAAATCTCAAGAAACCAAACTCTTGATCACGGTTTTG 
TTATTTTGGATTCATTACTATATCTATTAGTA^ 

TTTATAGATATATATATAGAGAAAAAGAGAGAGTGAGGATGGTTCAAATTATTTGCAGA 

>G937 Amino Acid Sequence (conserved domain in AA coordinates: 197-246)

MGSLGDELSLGSIFGRGVSMNWAVEKVDEHVTCKL^ 

LYLKDKRCSEMETQPLLKDFISWKPIQGERGIELIiKREELMREKKFQQWKANDDHTSKI 
KSKLEIKRNEEKSPMLLIPKVETGLGLGLSSSSIRRKGIVASCGFTSNSMPQPPTPAVPQ 
QPAFLKQQALRKQRRCWNPELHRRFVDALQQLGGPGVATP^ 

LQKYRLHIRKPNSNAEKQSAVVLGFNIiWNSSAQDEEETCEGGESLKRSNAQSDSPQGPLQ 

LPSTTTTTGGDSSMEDVEDAKSESFQLERLRSP* 

>G960 (63..1538)

TACCGTCGACCCACGCGTCCGAGTGTATTCAAAGTCGGAAAGAAACCCTAAAGAAGAGGA 
TTATGGGTGCTGTATCGATGGAGTCGCTTCCTTTAGGTTTCAGATTCAGACCTACCGATG 
AAGAGCTCGTCAATCACTACCTCCGTCTCAAGATCAACGGACGTCACTCCGATGTCCGTG 
TCATCCCTGATATCGATGTCTGCAAATGGGAACCTTGGGATCTTCCTGCTCTCTCGGTGA 
TTAAGACGGATGATCCAGAGTGGTTCTTTTTCTGCCCTCGTGATCGGAAATACCCTAATG 
GTCATCGCTCTAACAGAGCAACTGACTCTGGCTATTGGAAAGCTACTGGTAAAGATCGTA 
GCATCAAGTCTAAGAAGACTTTAATCGGTATGAAGAAGACTCTTGTCTTCTATCGTGGAC 
GAGCTCCTAAAGGTGAGCGGACTAATTGGATTATGCACGAGTATCGTCCCACTCTTAAGG 
ATCTTGATGGC^CTTCCCCTGGCCAAAGCCCTTACGTTCTTTGTCGCCTCTTCCACAAGC 
CTGATGATCGGGTTAATGGTGTCAAGTCCGATGAAGCAGCTTTTACGGCCAGCAACAAAT 
ACTCACCTGATGATACATCATCTGATCTTGTTCAAGAAACACCTTCCTCTGATGCTGCTG 
TTGAGAAACCATCAGATTATTCAGGTGGATGCGGTTATGCTCATAGTAATAGTACCGCAG 
ATGGGACAATGATTGAGGCACCTGAAGAGAATCTTTGGTTATCTTGTGACCTTGAAGATC 
AAAAGGCACCACTACCGTGTATGGATTCTATATATGCTGGTGATTTCAGTTACGATGAGA 
TTGGATTCCAATTTCAAGATGGTACCAGCGAACCAGATGTATCACTAACAGAATTGTTGG 
AGGAGGTGTTCAATAACCCTGATGACTTCTCTTGCGAGGAATCGATCAGTCGAGAGAATC 
CAGCAGTCTCACCAAATGGGATATTTTCATCTGCTAAAATGCTGCAGTCTGCAGCACCAG 
AGGATGCTTTCTTCAACGACTTCATGGCTTTCACTGATACAGATGCTGAGATGGCGCAAT 
TGCAGTATGGTTCAGAAGGTGGAGCTTCTGGTTGGCCAAGTGACACTAATTCATACTATA 
GTGATTTGGTTCAGCAAGAGCAAATGATCAATCATAACACAGAGAACAACCTCACAGAAG 
GGAGAGGGATAAAGATCCGGGCTCGACAGCCTCAGAACCGGCAGAGTACAGGATTGATAA 
ACCAGGGTATTGCTCCAAGGAGAATCCGTCTGCAGCTGCAGTCTAACTCTGAAGTAAAAG 
AACGAGAGGAGGTGAATGAAGGACACACTGTTATTCCCGAGGCCAAAGAAGCTGCAGCTA 
AATACTCAGAGAAGAGTGGTTCTTTGGTTAAACCTCAAATAAAGCTCAGGGCGCGGGGAA 
CTATAGGCCAAGTAAAAGGAGAGAGATTTGCAGACGACGAGGTACAGGTGCAGAGCACAA 
AGAGAGAGAGAGAGAGAATCAAATGTAGTTTAATGTAATTAGGGATGATGCAATGTTAGC 
ATGTTTGTGTGTTGTAACTTAAAAACTTATTTAGGAATCTGATAAAAGTTACTGTTGAAA 
AAAGAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

>G960 Amino Acid Sequence (domain in AA coordinates: 13-156) 
MGAVSMESLPLGFRFRPTDEELVNHYLRLKINGRHSDVRVIPDIDVCKWEPWDLPALSVI 
KTDDPEWFFFCPRDRKYPNGHRSNRATDSGYWKATGKDRSIKSKKTLIGMKKTLVFYRGR
APKGERTNWIMHEYRPTLKDLDGTSPGQSPYVIjCRLFHK^ 

SPDDTSSDLVQETPSSDAAVEKPSDYSGGCGYAHSNSTADGTMIEAPEENLWLSCDLEDQ 
KAPLPCMDSIYAGDFSYDEIGFQFQDGTSEPDVSLTELLEEVFNNPDDFSCEESISRENP 
AVSPNGIFSSAKMLQSAAPEDAFFNDFMAFTDTO^ 

DLVQQEQMINHNTENNLTEGRGIKIRARQPQNRQSTGLINQGIAPRRIRLQLQSNSEVKE
REEVNEGHTVIPEAKEAAAKYSEKSGSLVKPQIKLRARGTIGQVKGERFADDEVQVQSTK
RERERIKCSLM* 
>G991 (6..533)
GAAAAATGGAAGAAGAAAAGAGATTGGAGCTAAGGCTAGCTCCTCCTTGTCACCAATTCA
CTTCCAACAACAACATCAATGGATCTAAACAAAAAAGCTCGACCAAAGAAACATCATTCC 
TTTCC^TAACAGGGTTGAGGTAGCTCCAGTGGTGGGATGGCCGCCGGTGAGATCATCCC 
GGAGAAACCTAACGGCACAACTAAAGGAGGAGATGAAGAAGAAGGAGAGTGATGAAGAGA 
AGGAATTGTACGTTAAGATCAACATGGAAGGAGTTCCAATAGGAAGAAAAGTCAACCTTT 
CAGCTTATAACAACTACCAACAGCTTTC^CATGCCGTTGACCAACTCTTCTCTAAGAAAG 
ATTCGTGGGATCTAAACAGACAATACACTTTGGTCTACGAAGACACTGAAGGAGATAAAG 
TTCTGGTCGGGGATGTTCCTTGGGAGATGT^ 

TT^GACCTCCCACGCCTTrCTCACTCTC^CCTAGAAAA^TGGC^AGGAATAGAGAGAGG 

TTGGCCAAAATCATCAGTTCGATGGTTTGTTTTTAATGTAATTTTTGTGGAAACTAAT^ 

GGTTTGGCTTTGATTTACTGGTTTTCTTTTTCACTTATGTACTAGGTTTT^^ 

GTTATTTCTTGTTTTGGTTGTAAATATGCTGTTCGTTTAAGAAATCGGGGGTTAGTATGT 

TATCGTGTGTATAAAAATAGTGTAAGCACX3TAAGTTGATTACAAAAAAAAAAAAAAAAAA 

AAAAAAAAA 

>G991 Amino Acid Sequence (domain in AA coordinates: 7-14,48-59,82-115,128-164) 

MEEEKRLELRIAPPCHQFTSNNNINGSKQKSST 

NLTAQLKEEMKKKESDEEKELYVKINMEGVPIGRKWLSAYNOT 

WDLNRQYTLWEDTEGDKVLVGDVPWEMFVSTVKRLHVLKTSHAFS 

>G748 (98..1444)

CCACGCGTCCGCACTCTCCCAAATCTCTCTTCTTTAACAACAAAAAAAAAATCACAGAGA 
CATAGAGAGAAGAAGACGGAACAGAGGCTCCAAAAAAATGATGATGGAGACTAGAGATCC 
AGCTATTAAGCTTTTCGGTATGAAAATCCCTTTTCCGTCGGTTTTTGAATCGGCAGTTAC 
GGTGGAGGATGACGAAGAAGATGACTGGAGCGGCGGAGATGACAAATCACCAGAGAAGGT 
AACTCCAGAGTTATCAGATAAGAACAACAACAACTGTAACGACAACAGTTTTAACAATTC 
GAAACCCGAAACCTTGGACAAAGAGGAAGCGACATCAACTGATCAGATAGAGAGTAGTGA 
CACGCCTGAGGATAATCAGCAGACGACACCTGATGGTAAAACCCTAAAGAAACCGACTAA 
GATTCTACCGTGTCCGAGATGCAAAAGCATGGAGACCAAGTTCTGTTATTACAACAACTA 
CAACATAAACCAGCCTCGTCATTTCTGCAAGGCTTGTCAGAGATATTGGACTGCTGGAGG 
GACTATGAGGAATGTTCCTGTGGGGGCAGGACGTCGTAAGAACAAAAGCTCATCTTCTCA 
TTACCGTCACATCACTATTTCCGAGGCTCTTGAGGCTGCGAGGCTTGACCCGGGCTTACA 
GGCAAACACAAGGGTCTTGAGTTTTGGTCTCGAAGCTCAGCAGCAGCACGTTGCTGCTCC 
CATGACACCTGTTATGAAGCTACAAGAAGATCAAAAGGTCTCAAACGGTGCTAGGAACAG 
GTTTCACGGGTTAGCGGATCAACGGCTTGTAGCTCGGGTAGAGAATGGAGATGATTGCTC 
AAGCGGATCCTCTGTGACCACCTCTAACAATCACTCAGTGGATGAATCAAGAGCACAAAG 
CGGCAGTGTTGTTGAAGCACAAATGAACAACAACAACAACAATAACATGAATGGTTATGC 
TTGCATCCCAGGTGTTCCATGGCCTTACACGTGGAATCCAGCGATGCCTCCACCAGGTTT 
TTACCCGCCTCCAGGGTATCCAATGCCGTTTTACCCTTACTGGACCATCCCAATGCTACC 
ACCGCATCAATCCTCATCGCCTATAAGCCAAAAGTGTTCAAATACAAACTCTCCGACTCT 
CGGAAAGCATCCGAGAGATGAAGGATCATCGAAAAAGGACAATGAGACAGAGCGAAAACA 
GAAGGCCGGGTGCGTTCTGGTCCCGAAAACGTTGAGAATAGATGATCCTAACGAAGCAGC 
AAAGAGCTCGATATGGACAACATTGGGAATCAAGAACGAGGCGATGTGCAAAGCCGGTGG 
TATGTTCAAAGGGTTTGATCATAAGACAAAGATGTATAACAACGACAAAGCTGAGAACTC 
CCCTGTTCTTTCTGCTAACCCTGCTGCTCTATCAAGATCACACAATTTCCATGAACAGAT 
TTAGAGTTACATATGTATATGTATATATGTATGATTGATTGTATGTATAGATGATACTGG 
AGAATGATGAGTTTTTGAGAATCAAACTCTTTTCTTCTTTCTAGTGATTGCCTTTATTCC 
TTTACATGTTTTGGTTCTCTGTACACTATTTGATTTACCTTTTTTACTTTCTTTCTTCAT 
TTGTCAGGAAATGTTGGAAGATAACATTAATGGTAAAAAGTTGGTGTGGACCGTTGTTGC 
GTTGGCATTTCAAAAAAAAAAAAAAAA 

>G748 Amino Acid Sequence (domain in AA coordinates: 112-140) 
MMMETRDPAIKLFGMKIPFPSVFESAVTVEDDEEDDWS 

NDNSF^SKPETLDKEEATSTDQIESSDTPEDNQQTTPDGKTLKKPTKILPCPRCKSMET 
KFCYY1MYNINQPRHFCKACQRYWTAGGTMRNVPVGAGRRKNKSSSSHYRHITISEALEA 
ARLDPGLQANTRVLSFGLEAQQQHVAAPMTPVMKLQEDQKVSNGAI^RFHGIjADQRLV^ 
VENGDDCSSGSSVTTSNtniSVDESRAQSGSVVEAQMNNNl^^ 

PAMPPPGFYPPPGYPMPFYPYWTIPMLPPHQSSSPISQKCSNTNSPTLGKHPRDEGSSKK 
DNETERKQKAGCVLVPKTLRIDDPNEAAKSS I WTTLG I KNEAMCKAGGMFKGFDHKTKM Y 
NNDKAENS PVLS ANPAALSRSHNFHEQI * 
>G247 (1..660)

ATGAGAATGACAAGAGATGGAAAAGAACATGAATACAAGAAAGGTTTATGGACAGTGGAA 

GAAGACAAGATCCTCATGGATTATGTCCGAACTCATGGCCAGGGCCACTGGAACCGCATC 

GCCAAGAAAACTGGGCTCAAGAGATGTGGGAAAAGCTGTAGGTTGAGATGGATGAACTAC 

TTAAGCCCTAATGTTAACAGAGGCAATTTTACTGACCAAGAAGAAGATCTCATCATC^ 

CTCCACAAGCTCCTCGGCAACAGATGGTCGTTGATAGCGAAAAGAGTTCCGGGAAGAACA 

GACAACCAAGTAAAGAATTACTGGAACACACATCTCAGCAAGAAACTTGGTCTCGGAGAT 

C^TTCAACTGCCGTCAAAGCCGCATGCGGTGTAGAGTCTCCACCGTCTATGGCCCTTATA 

ACCACAACGTCCTCCTCTCATCAAGAGATCTCCGGTGGAAAAAATTCAACTCTAAGGTTC 

gacactttagttgacgaatccaaactcaaaccaaaatcct^aactagtccacgcaacacca 
actgacgtagaagttgcagctacggttccaaAtctgttcgataccttttgggttcttgaa 

GACGACTTCGAGCTTAGTTCACTC^CTATGATGGATTTTACTAATGGGTATTGCCTTTGA 

>G247 Amino Acid Sequence (domain in AA coordinates: 15-116) 

MRMTRDGKEHEYKKGLWTVEEDKILMDYVRTHGQGHWNRIAKKTGLKRCGKSCRLRWMNY

LSPNVNRGNFTDQEEDLIIRLHKLLGNRWSLIAKRVPGRTDNQVKNYWNTHLSKKLGLGD

HSTAVKAACGVESPPSMALITTTSSSHQEISGGKNSTLRFDTLVDESK^ 

TDVEVAATVPNLFDTFWVLEDDFELS SLTMMDFTNGYCL * 

>G585 (111..2039)

CTCTCAAACATTTCTCTGTTTGTTCCGGCGAAAACGGCAACTGTTTCATCAAATGACAAA 

CACAAAAACCTTAACATCTAGTTTGTATCCTCTCTGATACTTCAAAAAAAATGGATGAAG 

AAACAATGGCTACCGGACAAAACAGAACAACTGTGCCAGAGAATCTGAAGAAACACCTCG 

(^GTTTCAGTTCGAAACATTCAATGGAGTTATGGTATCTTTTGGTCTGTCTCTGCTTCTC 

AGTCTGGAGTTTTAGAATGGGGAGATGGATACTATAATGGAGATATCAAAACGAGGAAGA 

CGATTCAAGCTTCGGAGATCAAAGCTGATCAGCTTGGTCTACGGAGGAGCGAG CAG CTTA 

GCGAGCTTTACGAGTCTCTCTCCGTCGCTGAATCTTCTTCTTCAGGCGTTGCTGCCGGAT 

CTCAAGTCACCAGACGAGCTTCCGCCGCCGCACTTTCACCGGAAGATCTCGCCGACACCG 

AGTGGTACTATTTGGTTTGTATGTCTTTCGTCTTCAACATTGGTGAAGGAATGCCTGGAC 

GGACGTTTGCAAACGGTGAACCGATATGGTTGTGCAACGCTCATACGGCGGATAGTAAAG 

TGTTTAGCCGTTCTCTTCTAGCAAAAAGTGCTGCGGTTAAGACAGTGGTTTGCTTCCCGT 

TCCTTGGAGGAGTCGTTGAGATTGGTACCACAGAACATATTACGGAAGACATGAATGTAA 

TACAATGCGTGAAGACATCATTCCTCGAAGCCCCTGATCCGTACGCTACAATATTACCAG 

CAAGATCCGATTATCACATCGACAACGTTCTTGATCCGCAACAGATTCTAGGCGACGAGA 

TTTACGCGCCTATGTTCAGTACGGAGCCTTTTCCAACAGCTTCTCCGAGCAGAACTACCA 

ACGGTTTCGATCAAGAACATGAACAAGTAGCAGATGATCATGATTCTTTCATGACCGAAA 

GAATCACTGGAGGAGCTTCTCAGGTGCAAAGCTGGCAGCTCATGGACGACGAGCTTAGTA 

ACTGCGTTCACCAGTCGCTAAATTCCAGCGATTGCGTCTCTCAAACGTTTGTTGAAGGGG 

CGGCTGGACGGGTTGCTTACGGTGCAAGAAAGAGTAGAGTTCAAAGACTAGGGCAAATTC 

AAGAGCAACAGAGAAATGTGAAGACATTGTCATTTGATCCAAGAAACGACGACGTTCATT 

ACCAAAGTGTGATCTCAACGATTTTTAAGACCAACCATCAGTTAATTCTCGGACCGCAGT 

TTCGAAACTGCGATAAACAGTCAAGCTTCACTAGGTGGAAGAAATCATCGTCATCATCAT 

CAGGAACCGCCACGGTCACGGCACCATCACAAGGAATGTTAAAGAAAATTATTTTCGATG 

TTCCGCGAGTGCACCAGAAAGAGAAGTTAATGTTGGACTCACCAGAAGCCAGAGATGAAA 

CTGGGAACCATGCGGTTTTAGAGAAGAAGCGCCGCGAGAAATTGAACGAACGGTTCATGA 

CCTTGAGAAAAATCATTCCGTCAATCAACAAGATCGATAAAGTATCGATTCTTGACGATA 

CGATAGAGTATCTTCAAGAACTCGAGAGACGGGTTCAAGAACTAGAATCTTGCAGAGAAT 

CAACCGATACAGAGACTCGTGGGACGATGACGATGAAGAGGAAGAAACCATGCGACGCAG 

GAGAAAGAACATCAGCTAATTGCGCAAATAATGAAACAGGAAATGGGAAGAAGGTGTCGG 

TTAACAATGTTGGT-GAAGCCGAGCCAGCAGATACCGGTTTTACTGGTTTAACCGATAATT 

TAAGGATCGGTTCGTTTGGTAATGAGGTGGTTATTGAGCTTAGATGTGCTTGGAGAGAAG 

GAGTATTGCTTGAGATAATGGATGTGATTAGTGATCTCCATTTGGATTCTCATTCGGTTC 

AATCCTCGACCGGAGACGGTTTGCTCTGCTTAACCGTCAATTGCAAGCACAAGGGGTCAA 

AAATAGCGACACCAGGAATGATCAAAGAAGCACTTCAAAGGGTTGCATGGATCTGTTGAA 

GACTACTTAGTTAAAATTGACAGCAAAGAAAAAACATTCCCGGTTTGGTTTCTATTCTTT 

GGTTTTCTTCTAACCGGGTTTTAGGAATTAATGTTATGTTTATGATTTGTTT^ 

TTTTTTGTGTCTTTTTTTCCGTTGCTTAACGTAGGTGAAGAGGAACATACACTATGCGTA 

TTTTGTTTGAGGTAGATTATTTTAAGGGTATTAGTAATAGTAATAGCCAGTTTAGATGAT 

TTTGTGTTCTTTTGTTGTT 
>G585 Amino Acid Sequence (domain in AA coordinates: 436-501)

MDBETMATGQNRTTVPENLKKHIAVSVimiQWSYGIFWSVSASQSGVLEW 

TRKTIQASEIKADQLGLRRSEQLSELYESLSVAESSSSGVAAGSQVTRRASAAALSPEDL 

ADTEWYYLVCMSFVFNIGEGMPGRTFANGEPIWLC^AHTADSKVFSRSLI^UCSAAVKTW 

CFPFLGGVVEIGTTEHITEDMNVIQCVKTSFLEAPDPYATILPARSDYHIDNVLDPQQIL 

GDE I YAPMFSTEP F PTASP SRTTNGFDQEHEQVADDHDS FMTERITGGASQVQSWQLMDD 

ELSNC^HQSLNSSDCVSQTFVEGAAGRVAYGARKSRVQRLGQIQEQQRNVKTLSFDPRND 

DVHYQSVISTIFKTl^QLILGPQFRNCDKQSSFTRWKKSSSSSSGTATVTAPSQGMLKKI 

I FDVPRVHQKEKLMLDS PEARDETGNHAVLE KKRREKLNERFMTLRKI I PS INKIDKVS I 

LDDTIEYLQELERRVQELESCRESTDTETRGTMTOKRKKPCDAGERTSANCANNETGNGK 

KVSVNtTTOEAEPADTGFTGLTDNLRIGSFGNEW 
HSVQSSTGDGLLCLTVNCKHKGSKIATPGMIKEALQRVAWIC* 

>G634 (1..798) 

ATGGAGCAAGGAGGAGGTGGTGGTGGTAATGAAGTTGTGGAGGAAGCTTCACCTATTAGT 
TCAAGACCTCCTGCTAACAACTTAGAAGAGCTTATGAGATTCTCAGCCGCCGCGGATGAC 
GGTGGATTAGGAGGTGGAGGTGGAGGAGGAGGAGGAGGAAGTGCTTCTTCTTCATCGGGA 
AATCGATGGCCGAGAGAAGAAACTTTAGCTCTTCTTCGGATCCGATCCGATATGGATTCT 
ACT1^TCGTGATGCTACTCTCAAAGCTCCTCTTTGGGAACATGT?TTCCAGGAAGCTATTG 
GAGTTAGGTTACAAACGAAGTTCAAAGAAATGCAAAGAGAAATTCGAAAACGTTCAGAAA 
TATTACAAACGTACTAAAGAAACTCGCGGTGGTCGTCATGATGGTAAAGCTTACAAGTTC 
TTCTCTCAGCTTGAAGCTCTCAACACTACTCCTCCTCCTCCTCCTTCTCATCCTCACGCT 
CATCAACCAGAACAGAAAC^CAACAAC^C(^CAACAAGAGATGGTCATGAGCTCGGAA 
CAATCATCATTACCATCATCATCAAGATGGCCAAAGGCAGAGATTCTAGCGCTTATT^C 
CTGAGAAGTGGAATGGAACCAAGGTACCAAGATAATGTACCTAAAGGACTTCTATGGGAA 
GAGATCTCAACTTCAATGAAGAGAATGGGATACAACAGAAACGCTAAGAGATGTAAAGAG 
AAATGGGAAAACATAAACAAATACTACAAGAAAGTTAAAGAAAGCAACAACAGCAACTAC 

AACAACAAGAATCAATGA 

>G634 Amino Acid Sequence (domain in AA coordinates: 62-147, 189-245)
MEQGGGGGGNEVVEEASPISSRPPANNLEELMRPS 

NRWPREETIiALLRIRSDMDSTFRDATLKAPLWEHVSRKLLELGYKRSSKKCKEKFENVQK 

YYKRTKETRGGRHDGKAYKFFSQLEALNTTPPPPPSHPHAHQPEQKQQQQPQQEMVMSSE 

QSSLPSSSRWPKAEILALINLRSGMEPRYQDNVPKGLLWEEISTSMKRMGYWRNAKRCKE 

KWENINKYYKKVKESNNSNYNNKNQ* 

>G676 (1..612) 

atgagaaagaaagtaagtagtagtggtgacgaaggaaacaatgagtacaagaaaggtttg 
tggacagtagaagaagacaaaatcctcatggattatgtcaaagctcatggcaaaggtcac 
tggaatcgtattgccaaaaagactggtttaaagagatgtggaaagagttgtagattgagg 
tggatgaattatctcagccctaatgtgaaaagaggcaatttcaccgagcaagaagaggat 
cttatcattaggctccacaagttgcttggtaataggtggtctttaattgctaaaagagtg 
ccgggtcgaacggataatcaagtgaagaactattggaacacgcatcttagtaagaaactc 
ggaatcaaagatcagaaaaccaaacagagcaatggtgatattgtttatcaaatcaatctc 
ccgaatcctaccgaaacatcagaagaaacgaaaatctcgaatattgtcgataacaataat 
atcctcggagatgaaattcaagaagatcatcaaggaagtaactacttgagttcactttgg 
gttcatgaggatgagtttgagcttagcacactcaccaacatgatggactttatagatgga 

cactgtttttga 

>G676 Amino Acid Sequence (domain in AA coordinates: 17-119) 

MRKKVSSSGDEGNNEYKKGLWTVEEDKILMDYVKAHGKGHWNRIAKKTGLKRCGKSCRLR
WMNYLSPNVKRGNFTEQEEDLIIRLHKLLGNRWSLIAKRVPGRTDNQVKNYWNTHLSKKL

GIKDQKTKQSNGDIVYQINLPNPTETSEETKISNIVDNNNILGDEIQEDHQGSNYLSSLW 

VHEDEFELSTLTNMMDFIDGHCF* 

>G682 (1..228) 

ATGGATAACCATCGCAGGACTAAGCAACCCAAGACCAACTCCATCGTTACTTCTTCTTCT 

GAAGAAGTGAGTAGTCTTGAGTGGGAAGTTGTGAACATGAGTCAAGAAGAAGAAGATTTG 

GTCTCTCGAATGCATAAGCTTGTCGGTGACAGGTGGGAACTGATAGCTGGGAGGATCCCA 

GGAAGAACCGCTGGAGAAATTGAGAGGTTTTGGGTCATGAAAAATTGA 

>G682 Amino Acid Sequence (domain in AA coordinates: 27-63)

MDNHRRTKQPKTNSIVTSSSEEVSSLEWEVVNMSQEEEDLVSRMHKLVGDRWELIAGRIP



GRTAGEIERFWVMKN* 
>G635 (1..993) 

ATGGAGATCATGCGTCCAGGGGTCTCAGAAAACACTTTGAAAGGAAAAATAAGAATCACA 
ACGCGGTGCATGTGGCTTGAC7U^GGAAGACTTTTAGATGCACTTCACAAAGCAGCTCAT 
GCTGCTCTATCAAGTTGTCCTGTGACATGTCCCTTGTCTCACATGGAAAGAACAGTCTCC 
GAAGTCCTGAGGAAGATTGTAAGGAAGTACAGTGGTAAAAGGCCTGAAGTCATCGCTATA 
GCCACTGAGAATCC^^TGGCTGTCCGAGCTGATGAGGTCAGTGCGAGACTGTCrrGGTGAT 
CCAAGTGTTGGTTCTGGAGTTGCAGCTTTAAGGAAAGTTGTTGAAGGAAATGACAAAAGA 
AGTCGGGCGAAGAAAGCACCTTCACAAGAAGCTTCCCCCAAAGAAGTAGATCGCACTTTG 
GAAGATGATATCATTGATAGTGCAAGACTACTGGCTGAAGAAGAAACTGCGGCATCAACA 
TACACGGAAGAAGTTGATACGCCCGTTGGGAGTTCTTCAGAAGAGTCAGACGATTTTTGG 
AAATCATTCATC^TCCATCATCGTCACCTTCACCGAGTGAAACAGAAAATATGAATAAG 
GTAGCTGATACGGAGCCTAAAGCAGAGGGTAAGGAAAACAGCAGAGACGACGATGAATTA 
GCTGATGCTTCAGATTCTGAAACCAAGTCATCACCAAAACGTGTGAGGAAGAACAAATGG 
AAACCGGAGGAGATAAAGAAGGTAATCAGAATGCGAGGAGAGCTGCACAGTAGATTTCAA 
GTGGTGAAAGGTAGAATGGCATTGTGGGAAGAGATCTCTTCAAATCTATCAGCTGAAGGA 
ATCAATCGAAGCCCGGGACAATGCAAATCTCTCTGGGCATCACTTATTCAGAAATACGAG 
GAGAGCAAGGCTGATGAGAGAAGCAAGACGAGTTGGCCACATTTTGAGGATATGAACAAC 
ATTTTGTCAGAGCTAGGCACACCTGCGTCTTAA 

>G635 Amino Acid Sequence (domain in AA coordinates: 239-323) 

MEIMRPGVSENTLKGKIRITTRCMWLDKGRLLDALHKAAHAALSSCPVTCPLSHMERTVS 

EVLRKIVRKYSGKRPEVIAIATENPMAVRADEVSARLSGDPSVGSGVAALRKVVEGNDKR 

SRAKKAPSQEASPKEVDRTLEDDIIDSARLLAEEETAASTYTEEVDTPVGSSSEESDDFW 

KSFINPSSSPSPSETENWNK^ADTEPKT^GKENSRDDDELADASDSETKSSPKRWKNK^ 

KPEEIKKVIRMRGELHSRFQVVKGRMALWEEISSNLSAEGINRSPGQCKSLWASLIQKYE 

ESKADERSKTSWPHFEDMNNILSELGTPAS*

>G1068 (150..1310)

GAGAGTTGTTAGCTAGCTC^C^CGCTTTCGCTTAAAACTCAAAAACCTGCACTTTCTCGT 
CTATTTTCTCGGCATTCGTAAAACAGAAAAGTGGGTCTCCAAGAAAATTACCCTAAATTC 
ACAAAGATTCATACTTTTCTCCACCTCCAATGGATTCCAGAGAGATCCACCACCAACAAC 
AGCAAC^ACAACAACAACAACAGCAGCAGCAGCAACAACAGCAACATCTACAACAACAGC 
AACAACCACCGCCAGGGATGTTAATGAGTCACCACAATTCCTACAATCGAAACCCTAACG 
CCGCCGCCGCTGTTTTAATGGGTCACAACACCTCCACATCTCAAGCTATGCATCAAAGAT 
TACCTTTTGGTGGTTCTATGTCACCGCATCAGCCTCAACAACATCAGTATCATCATCCTC 
AGCCTCAGCAACAGATAGATCAGAAGACTCTTGAATCTCTTGGATTTCCTACTTCGCCTC 
TTCCTTCTGCTTCTAATTCTTACGGTGGTGGAAATGAAGGAGGTGGTGGTGGTGATAGCG 
CCGGAGCTAATGCTAACTCTTCCGATCCACCTGCTAAACGGAACAGAGGACGTCCTCCTG 
GCTCCGGTAAGAAGCAGCTCGATGCTTTAGGAGGAACAGGAGGAGTTGGGTTCACGCCTC 
ATGTCATTGAGGTTAAAACAGGAGAGGACATAGCTACGAAGATATTGGCGTTTACGAACC 
AAGGGCCACGCGCAATCTGTATTCTCTCAGCTACAGGAGCTGTAACTAATGTGATGCTTC 
GTCAAGCTAACAATAGCAATCCTACTGGAACTGITAAGTATGAGGGCCGATTTGAAATCA 
TTTCTCTGTCAGGTTCTTTCTTGAATTCTGAGAGTAATGGTACTGTGACCAAAACTGGTA 
ACTTGAGTGTGTCGCTGGCTGGACACGAAGGCCGGATTGTGGGTGGATGTGTTGATGGAA 
TGCTAGTAGCTGGATCACAAGTCCAGGTCATTGTGGGAAGCTTTGTACCAGATGGAAGGA 
AGCAGAAACAAAGTGCGGGGCGTGCTCAGAATACTCCGGAGCCAGCTTCAGCACCAGCCA 
ATATGTTGAGCTTTGGTGGTGTTGGTGGACCGGGAAGCCCTCGATCTCT^AGGACAACAAC 
ACTCGAGCGAGTCATC^GAGGAAAACGAAAGTAATTCTCCGTTGCACCGTAGAAGCAACA 
ACAACAACAGCAACAATCATGGGATATTTGGAAACTCTACACCTCAACCGCTTCACCAAA 
TTCCTATGCAGATGTACCAGAATCTCTGGCCTGGCAACAGTCCTCAATAAACAGATGGTT 
CATGGGTGAAGATTTGACCGGGTTTGCTTCTCTC 

ATTTATCTCTATAAAGTAGATTGAGCTCTCTTACTCTCTCATCTTCTTCTCCTTTACTAT 

TTCTCTTAAATTTAGCTTTGGTTTTAGATAAATAGAGAGAGAGAGACATGTTAAGTAGGT 

TTCAAATTCAATCTTGTTTAGTTTGTTTCTTA 

AAGACTTGTTCTTTTTCTCCTATATTCAACGAATTATCCACTTTAA 

>G1068 Amino Acid Sequence (domain in AA coordinates: 143-150) 

^SREIHHQQQQQQQQWQQQQQQQHLQQQQQPPPGMLMSHHNSYNRNPNAAAAVLMG^ 

TSTSQAMHQRLPFGGSMSPHQPQQHQYHHPQPQQQIDQKTLESLGFPTSPLPSASNSYGG 
GNEGGGGGDSAGANANSSDPPAKRNRGRPPGSGKKQLDALGGTGGVGFTPHVIEVKTGED
I ATKI LAFTNQGPRAI C I LS ATG AVTNVMLRQANNSNPTGTVKYEGRFE 1 1 SLSGS FLNS 
ESNGTVTKTGNLSVSLAGHEGRIVGGCVDGMLVAGSQVQVIVGSFVPDGRKQKQSAGRAQ 
NTPEPASAPANMLSFGGVGGPGSPRSQGQQHSSESSEENESNSPLHRRSNNNNSNNHGIF 
GNSTPQPLHQ I PMQMYQNLW PGNS PQ * 
>G1225 (1..984) 

ATGACTCTAGAAGCTTTATCATCAAACGGTCTTTTAAACTTTTTGCTCTCTGAAACTCTT 
TCACCAACTCCATTCAAGTCTCTCGTCGATCTCGAGCCATTGCCGGAAAATGATGTCATC 
ATATCGAAGAACACAATTTCGGAGATATCTAATCAAGAACCGCCACCACAGCGACAACCA 
CC^GCTACGAATCGAGGGAAGAAGCGGCGGAGGAGGAAGCCTAGGGTTTGCAAAAACGAG 
GAAGAAGCTGAGAATCAACGAATGACTCACATTGCCGTCGAAAGAAATCGAAGAAGACAA 
ATGAATCAACATCTCTCTGTCTTGCGATCTCTCATGCCTCAACCTTTTGCTCACAAGGGT 
GATCAAGCTTCAATAGTTGGTGGAGCCATAGATTTCATCAAAGAACTTGAACACAAATTA 
CTATCTCTTGAAGCTCAAAAACATCATAATGCT 

ACAAGTCAAGACTCAAATGGTGAACAAGAGAATCCTCATCAACCATCTTCACTATCTCTA 
TCGCAGTTCTTTCTTCATTCATACGATCCGAGCCAAGAGT^ATAGGAACGGCTCAACAAGC 
TCGGTGAAAACCCCTATGGAAGATCTTGAGGTGACTCTAATCGAAACTCATGCTAACATC 
AGAATCTTGTCGAGAAGAAGAGGTTTCCGGTGGAGCACGTTGGCCACCACCAAACCGCCG 
CAGCTTTCGAAGCTGGTGGCTTCTCTACAATCGCTGTCCCTCTCCATTCI^CACCTTAGT 
GTCACAACATTGGACAATTATGCTATTTACTCCATCAGCGCTAAGGTGGAAGAGAGTTGC 
CAGCTAAGTTCAGTAGATGACATTG CAGGAG CAGTTCACCACATGCTAAGT ATCATTGAA 
GAGGAGCCTTTTTGTTGCTCATCAATGTCAGAATTACCATTTGACTTCTCTTTGAATCAC 
TCAAATGTCACTCATTCTCTCTGAGAAATCTCTTTTTTGTTGTTGTTATTCCTTCTTTTA 
ATTTTATCACATAGCACAT CTTTAGTTTTTTTTTTT 

>G1225 Amino Acid Sequence (domain in AA coordinates: 78-147) 

MTLEALSSNGLLNFLLSETLSPTPFKSLVDLEPLPENDVIISKNTISEISNQEPPPQRQP 

PATTraGKIORRRRKPRVCKNEEEAENQRMTHI 

DQAS IVGGAIDFIKELEHKLLSLEAQKHHNAJKLNQS VTSSTSQDSNGEQENPHQPS SLSL 

SQFFLHSYDPSQENRNGSTSSVKTPMEDLEWLIETHANIRILSRRRGFRWSTLATTKPP 

QLSKLVASLQSLSLSILHLSOTTLDNYAIYSISAKVEESCQLSSVDDIAGAVHHMLSIIE 

EEPFCCSSMSELPFDFSLNHSNVTHSL* 

>G1337 (97..1398)

AATGGATTTGTCATCATTCTTCTCACCGTCCTTAGTCTCTGAAAATAAATTCTGATTTTG 
ATTTCGAATTTTAGGGATTTTGAGAGAGAGTCAGTTATGAGTAGTTCGGAGAGAGTACCG 
TGCGATTTCTGCGGCGAGCGTACGGCGGTTTTGTTTTGTAGAGCCGATACGGCGAAGCTG 
TGTTTGCCTTGTGATCAGCAAGTTCACACGGCGAATCTGTTGTCGAGGAAGCACGTGCGA 
TCTCAGATCTGCGATAATTGCGGTAACGAGCCAGTCTCTGTTCGGTGTTTCACCGATAAT 
CTGATTTTGTGTCAGGAGTGTGATTGGGATGTTCACGGAAGTTGTTCAGTTTCCGATGCT 
CATGTTCGATCCGCCGTGGAAGGTTTTTCCGGTTGTCCATCGGCGTTGGAGCTTGCTGCT 
TTATGGGGACTTGATTTGGAGCAAGGGAGGAAAGATGAAGAGAATCAAGTTCCGATGATG 
GCGATGATGATGGATAATTTCGGGATGCAGTTGGATTCTTGGGTTTTGGGATCTAATGAA 
TTGATTGTTCCCAGCGATACGACGTTTAAGAAGCGTGGATCTTGTGGATCTAGTTGTGGG 
AGGTATAAGCAGGTATTGTGTAAGCAGCTTGAGGAGTTGCTTAAGAGTGGTGTTGTCGGT 
GGTGATGGCGATGATGGTGATCGTGACCGTGATTGTGACCGTGAGGGTGCTTGTGATGGA 
GATGGAGATGGAGAAGCAGGAGAGGGGCTTATGGTTCCGGAGATGTCAGAGAGATTGAAA 
TGGTCAAGAGATGTTGAGGAGATCAATGGTGGCGGAGGAGGAGGAGTTAACCAGCAGTGG 
AATGCTACTACTACTAATCCTAGTGGTGGCCAGAGTTCTCAGATATGGGATTTTAACTTG 
GGACAGTCACGGGGftCCTGAGGATACGAGTCGAGTGGAAGCTGCATATGTAGGGAAAGGT 
GCTGCTTCTTCATTCACAATCAACAATTTTGTTGACCATATGAATGAAACTTGTTCCACT 
AATGTGAAAGGTGTCAAAGAGATTAAAAAGGATGACTACAAGCGATCAACTTCAGGCCAG 
GTACAACCAACAAAATCTGAGAGCAACAATCGTCCAATTACCTTTGGCTCTGAGAAAGGT 
TCGAACTCCTCCAGTGACTTGCATTTCACAGAGCATATTGCTGGAACTAGTTGTAAGACC 
ACAAGACTAGTTGCAACTAAGGCTGATCTGGAGCGGCTGGCTCAGAACAGAGGAGATGCA 
ATGCAGCGTTACAAGGAAAAGAGGAAGACACGGAGATATGATAAGACCATAAGGTATGAA 
TCGAGGAAGGCAAGAGCTGACACTAGGTTGCGTGTCAGAGGCAGATTTGTGAAAGCTAGT 
GAAGCTCCTTACCCTTAACCTTAAGTTTTTTCACATAGGCTTCCTTTTAGCTACAAACTT 
GCCCTTCTTGTTTTATTGCCTTATCTGGCCCTTTTATGTACCTTGGAATCTTATCTAGTT
TAAAAAAGATTGTAACCTTCTAGAAAACCATATTCTGTTGACAGTATATACATGTCTATC 
CAAGCAAAAA 

>G1337 Amino Acid Sequence (domain in AA coordinates: 9-75) 

MSSSERVPCDFCGERTAVLFCRADTAKLCLPCDQQVHTANLLSRKHVRSQICDNCGNEPV 

SVRCFTDNLILCQECDWDVHGSCSVSDAHVRSAVEGFSGCPSALELAALWGIiDLEQGRKD 

EENQVPMMAMMMDNFGMQLDSWVLGSNELIVPSDTTFKKRGSCGSSCGRYKQVLCKQLEE 

LLKSGVVGGDGDDGDRDRDCDREGACDGDGDGEAGEGLMVPEMSERLKWSRDVEEINGGG 

GGGVNQQWNATTTNPSGGQSSQIWDFNLGQSRGPEDT^ 

HMNETCSTNVKGVKEIKKDDYKRSTSGQVQPTKSESN1TOPITFGSEKGSNSSSDLHFTEH 
I AGTS CKTTRLVATKADLERLAQNRGDAMQRYKEKRKTRRYDKT I RYESRKARADTRLRV 
RGRFVKAS EAP YP * 
>G1759 (110..700)

CGAGAAAAGGAAAAAAAAAAATAGAAAGAGAAAACGCTTAGTATCTCCGGCGACTTGAAC 
CCAAACCTGAGGATCAAATTAGGGGAGW^GCCCTC^ 

AAAACTAGAAATCAAGCGAATTGAGAACAAAAGTAGCCGACAAGTCACCTTCTCCAAACG 
TCGCAACGGTCTCATCGAGAAAGCTCGTCAGCTTTCTGTTCTCTGTGACGCATCCGTCGC 
TCTTCTCGTCGTCTCCGCCTCCGGCAAGCTCTACAGCTTCTCCTCCGGCGATAACCTGGT 
CAAGATCCTTGATCGATATGGGAAACAGCATGCTGATGATCTTAAAGCCTTGGATCATCA 
GTCAAAAGCTCTGAACTATGGTTCACACTATGAGCTACTTGAACTTGTGGATAGCAAGCT 
TGTGGGATCAAATGTCAAAAATGTGAGTATCGATGCTCTTGTTCAACTGGAGGAACACCT 
TGAGACTGCCCTCTCCGTGACTAGAGCCAAGAAGACCGAACTCATGTTGAAGCTTGTTGA 
GAATCTTAAAGAAAAGGAGAAAATGCTGAAAGAAGAGAACCAGGTTTTGGCTAGCCAGAT 
GGAGAATAATCATCATGTGGGAGCAGAAGCTGAGATGGAGATGTCACCTGCTGGACAAAT 
CTCCGACAATCTTCCGGTGACTCTCCCACTAC 

AAATCAAAATCCAAAACATATATAATTATGAAGAAAAAAAAAATAAGATATGTAATT^ 
CCGCTGATAAGGGCGAGCGTTTGTATATCTTAATACTCTCTCTTTGGCGAAGAGACTTTG 
TGTGTGATACTTAAGTAGACGGAACTAAGTCAATACTATCTGTTTTAAGACAAAAGGTTG 
ATGAACTTTGTACCTTATTCGTGTGAGAAAAAAAAAAAAAAAA 

>G1759 Amino Acid Sequence (conserved domain in AA coordinates: 2-57) 

MGRKKLEIKRIENKSSRQVTFSK31RNGLIEKARQLSVLCDASVALLWSASGKLYSFSSG 

DJSTLVlCILDRYGKQHADDLKALDHQSKALNYGSHYELLELvTlSK^ 

EEHLETALSVTRAKKTELMLKLVBNLKEK^ P 

AGQISDNLPVTLPLLN* 

>G1804 (169..1497)

TATCTCTCTCTTTCTCAAAACCTTTCAGTCAAAATTCTCCGGCGGCTTTTAAACTATGTG 
AAGGAGGAGAACCTCCATAACAAGAAGCGGATTCTCTCAGTTTTCCGGCGGCGGAGGAAC 
ACAAAGCCACCGGTTTTTAGACACACAGATTTCATTTTCAGTTGTTAAATGGTAACTAGA 
GAAACGAAGTTGACGTCAGAGCGAGAAGTAGAGTCGTCCATGGCGCAAGCGAGACATAAT 
GGAGGAGGTGGTGGTGAGAATCATCCGTTTACTTCTTTGGGAAGACAATCCTCTATCTAC 
TCATTGACCCTTGACGAGTTCCAACATGCTTTATGTGAGAACGGCAAGAACTTTGGGTCC 
ATGAACATGGACGAGTTTCTTGTCTCTATTTGGAACGCAGAGGAGAATAATAACAATCAA 
CAACAAGCAGCAGCAGCTGCAGGTTCACATTCTGTTCCGGCTAATCACAATGGTTTCAAC 
AACAACAATAACAATGGAGGCGAGGGTGGTGTTGGTGTCTTTAGTGGTGGTTCTAGAGGC 
AACGAAGATGCTAACAATAAGAGAGGGATAGCGAACGAGTCTAGTCTTCCTCGACAAGGC 
TCTTTGACACTTCCAGCTCCGCTTTGTAGGAAGACTGTTGATGAGGTTTGGTCTGAGATA 
CATAGAGGTGGTGGTAGCGGTAATGGAGGAGACAGCAATGGACGTAGTAGTAGTAGTAAT 
GGACAGAACAATGCTCAGAACGGCGGTGAGACTGCGGCTAGACAACCGACTTTTGGAGAG 
ATGACACTTGAGGATTTCTTGGTGAAGGCTGGTGTGGTTAGAGAACATCCCACTAATCCT 
AAACCTAATCCAAACCCGAACCAAAACCAAAACCCGTCTAGTGTAATACCCGCAGCTGCA 
CAGCAACAGCTTTATGGTGTGTTTCAAGGAACCGGTGATCCTTCATTCCCGGGTCAAGCT 
ATGGGTGTGGGTGACCCATCAGGTTATGCTAAAAGGACAGGAGGAGGAGGGTATCAGCAG 
GCGCCACCAGTTCAGGCAGGTGTTTGCTATGGAGGTGGCGTTGGGTTTGGAGCGGGTGGA 
CAGCAAATGGGAATGGTTGGACCGTTAAGCCCGGTGTCTTCAGATGGATTAGGACATGGA 
CAAGTGGATAACATAGGAGGTCAGTATGGAGTAGATATGGGAGGGCTAAGGGGAAGGAAA 
AGAGTAGTGGATGGTCCAGTGGAGAAAGTAGTGGAGAGAAGACAGAGGAGGATGATCAAG 
AACCGCGAGTCTGCTGCTAGATCTAGAGCAAGAAAACAAGCATATACAGTGGAATTGGAA 
GCTGAACTTAACCAGTTGAAAGAAGAGAATGCGCAGCTAAAACATGCATTGGCGGAGTTG
GAGAGGAAGAGGAAGCAACAGTATTTTGAGAGTTTGAAGTCAAGGGCACAACCGAAATTG 
CCGATU^TCGAACGGGAGATTGCGGACATTGATGAGGAACCCGAGTTGTCCACTCTAAACA 
AACAATAGGAAGATGGAGAAGAAGTCGGAGAC^GAACGAGGGAAAAACTGATGATTTTCT 
ACGTTGTTGTTTTGTCTTTGAGGAATGAGGTTATAGAATCTTTATACTTTGATGTTTTCT 
GTGTTGGTAGGAGGAACACCATCTGATCTGCTTTACTAGTGTTCCCTGTGAACAAAGAAA 
GTGATTCTGTGTTTCAACATCATCAATCTTTGGAAA 

>G1804 Amino Acid Sequence (domain in AA coordinates: 357-407)

MVTRETKLTSEREVESSMAQARHNGGGGGENHPFTSLGRQSSIYSLTLDEFQHALCENGK 

NFGSMNMDEFLVSIWNAEENNNNQQQAAAAAGSHSVPANH^ 

GSRGNEDANNKRGIANESSLPRQGSLTLPAPLCRKTVDEVWSEIHRGGGSGNGGDSNGRS 

SSSNGQNNAQNGGETAARQPTFGEMTLEDFLVKAGWREHPTNPKPNPNPNQNQNPSSVI 

PAAAQQQLYGVFQGTGDPSFPGQAMGVGDPSGYAKRTGGGGYQQAPPVQAGVCYGGGVGF 

GAGGQQMGMVGPLS PVSSDGLGHGQVDN IGGQYGVDMGGLRGRKRWDGPVEKWERRQR 

RM I KNRE S AARSRARKQAYTVELEAEIjNQLKEENAQLKHAIjAELERKRKQQYFESLKSRA 

QPKLPKSNGRLRTLMRNPSCPL* 

>G207 (16..930)

aaaagatctgtttcaatggcggatcgtgttaaaggtccatggagtcaagaagaagatgag 
cagctacgaaggatggttgagaaatacggaccgaggaattggtctgcgattagcaaatcg 
attccaggtcgatctggtaaatcgtgtagattacgttggtgtaatcagttatctccggag 
gttgagcatcgtcctttctcgccggaggaagatgagactattgtaaccgcccgtgctcag 
tttggtaacaagtgggcgacgattgctcgtcttcttaacggtcgtacggataacgccgtt 
aaaaatcactggaactctacgcttaagaggaaatgcagcggaggtgtggcggttacgacg 
gtgacggagacggaggaagatcaggatcggccgaagaagaggagatctgttagctttgat 
cctgcttttgctccggtggatactggattgtacatgagtcctgagagtcctaacggaatc 
gatgttagtgattctagcacgattccgtcaccgtcgtctcctgttgctcagctgtttaaa 
ccaatgccgatttccggcggttttacggtggttccgcagccgttaccggttgaaatgtct 
tcgtcttcggaggatccacctacttcgttgagtttgtcactacctggagctgagaacacg 
agttcgagccataacaataacaacaacgcgttgatgtttccgagatttgagagtcagatg 
aagattaatgtagaggagagaggaggaggaggagaaggacgtagaggtgagtttatgacg 
gtggtgcaggagatgataaaagctgaagtgaggagttacatggcggaaatgcagaaaaca 
agtggtggattcgtcgtcggaggtttatacgaatccggcggcaatggtggttttagggat 
tgtggagtaataacacctaaggttgagtagttttggtttagggttaaaacttgaatcgat 
tggggattttcaagagcattcatttttggggtttatggtaaaattaaaaacaaaaacaaa 
atgtacagaggaattaaaatttctatggaataatcttaaatctcaaatatttgttacttg 
ttttggtgattcataaccaaaatcaaa 

>G207 Amino Acid Sequence (domain in AA coordinates: 6-106) 

MADRVKGPWSQEEDEQLRRMVEKYGPRNWSAISKSIPGRSGKSCRLRWCNQLSPEVEHRP

FSPEEDETIVTARAQFGNKWATIARLLNGRTDNAVKNHWNSTLKRKCSGGVAVTTVTETE

EDQDRPKKRRSVSFDPAFAPVDTGLYMSPESPNGIDVSDSSTIPSPSSPVAQLFKPMPIS
GGFTVVPQPLPVEMSSSSEDPPTSLSLSLPGAENTSSSHNNNNNALMFPRFESQMKINVE
ERGGGGEGRRGEFMTVVQEMIKAEVRSYMAEMQKTSGGFVVGGLYESGGNGGFRDCGVIT
PKVE* 

>G218 (1..1182) 

ATGGAGGCAGAGATCGTGAGACGATCGGAGGTAACGGGATTAAGAAGGGAGGTGGAAGAA 
TCGTCAATTGGTAGAGGAGATTGCGATGGTGATGGCGGCGATGTGGGAGAAGATGCGGCA 
GGGTTCGTTGGGACGAGCGGGAGAGGAAGAAGAGATCGAGTTAAAGGGCCGTGGTCGAAG 
GAGGAGGATGATGTGTTGAGTGAGCTCGTTAAGAGGTTGGGAGCGAGGAATTGGAGTTTT 
ATCGCTCGGAGTATTCCTGGTCGTTCAGGCAAGTCTTGTCGTCTTCGTTGGTGTAATCAG 
CTCAATCCAAATCTTATACGCAATTCATTTACTGAGGTAGAGGATCAGGCTATCATCGCA 
GCACATGCCATCCACGGAAACAAATGGGCTGTTATCGCGAAGCTCCTCCCCGGAAGAACA 
GATAATGCTATCAAGAACCACTGGAACTCTGCTTTAAGACGTCGATTCATAGACTTTGAA 
AAGGCCAAGAATATAGGAACTGGAAGCTTGGTCGTGGATGATTCTGGATTTGACAGAACG 
ACAACAGTAGCCTCATCAGAAGAAACTTTATCTTCAGGCGGTGGTTGCCATGTAACTACT 
CCAATTGTATCTCCAGAAGGCAAAGAAGCTACCACCTCCATGGAAATGTCTGAAGAACAA 
TGCGTAGAGAAAACAAACGGAGAAGGTATTTCTAGGCAAGATGATAAGGATCCTCCAACG 
CTTTTCCGCCCAGTGCCTCGGCTCAGTTCTTTTAATGCTTGCAATCACATGGAAGGATCA 
CCCTCTCCACATATACAAGACCAAAATCAGCTCCAATCATCTAAACAAGACGCAGCAATG

CTAAGATTGCTTGAAGGAGCTTACAGCGAACGGTTTGTGCCTCAAACATGTGGAGGTGGT 

TGTTGCAGCAACAATCCCGATGGCAGTTTTCAGCAAGAATCATTGTTGGGTCCAGAGTTT 

GTGGATTACTTAGACTCACCAACGTTTCCGAGTTCCGAACTAGCTGCTATAGCAACGGAA 

ATAGGCAGCCTCGCTTGGCTGAGAAGCGGTTTAGAGAGTAGCAGCGTGAGGGTGATGGAA 

GACGCAGTTGGTCGGTTAAGGCCTCAAGGCTCCAGGGGTCATCGAGATCATTATCTTGTA 

TCTGAACAGGGGACGAACATAACCAATGTCCTGTCCACATAA 

>G218 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEAEIVRRSEVTGLRREVEESSIGRGDCDGDGGDVGEDAAGFVGTSGRGRRDRVKGPWSK

EEDDVLSELVKRLGARNWSFIARSIPGRSGKSCRLRWCNQLNPNLIRNSFTEVEDQAIIA

AHAIHGNKWAVIAKLLPGRTDNAIKNHWNSALRRRFIDFEKAKNIGTGSLVVDDSGFDRT

TTVASSEETLSSGGGCHVTTPIVSPEGKEATTSMEMSEEQCVEKTNGEGISRQDDKDPPT

LFRPVPRLSSFNACNHMEGSPSPHIQDQNQLQSSKQDAAMLRLLEGAYSERFVPQTCGGG
CCSNNPDGSFQQESLLGPEFVDYLDSPTFPSSELAAIATEIGSLAWLRSGLESSSVRVME 
DAVGRLRPQGSRGHRDHYLVSEQGTNITNVLST* 
>G241 (46..867)

GAAAAA(^TTTCAACTTCTTTTATCAGCAATC 

TGCTGTGAGAAGATGGGGTTGAAGAGAGGACCATGGACACCTGAAGAAGATCAAATCTTG 

GTCTCTTTTATCCTCAACCATGGACATAGTAACTGGCGAGCCCTCCCTAAGCAAGCTGGT 

CTTTTGAGATGTGGAAAAAGCTGTAGACTTAGGTGGATGAACTATTTAAAGCCTGATATT 

AAACGTGGCAATTTCACCAAAGAAGAGGAAGATGCTATCATCAGCTTACACCA 

GGCAATAGATGGTCAGCGATTGCAGCAAAACTGCCTGGAAGAACCGATAACGAGATCAAG 

AACGTATGGCACACTCACTTGAAGAAGAGACTCGAAGATTATCAACCAGCTAAACCTAAG 

ACCAGCAACAAAAAGAAGGGTACTAAACCAAAATCTGAATCCGTAATAACGAGCTCGAAC 

AGTACTAGAAGCGAATCGGAGCTAGCAGATTCATCAAACCCTTCTGGAGAAAGCTTATTT 

TCGACATCGCCTTCGACAAGTGAGGTTTCTTCGATGACACTCATAAGCCACGACGGCTAT 

AGCAACGAGATTAATATGGATAACAAACCGGGAGATATCAGTACTATCGATCAAGAATGT 

GTTTCTTTCGAAACTTTTGGTGCGGATATCGATGAAAGCTTCTGGAAAGAGACACTGTAT 

AGCCAAGATGAACACAACTACGTATCGAATGACCTAGAAGTCGCTGGTTTAGTTGAGATA 

CAACAAGAGTTTCAAAACTTGGGCTCCGCTAATAATGAGATGATTTTTGACAGTGAGATG 

GAACTTCTGGTTCGATGTATTGGCTAGAACCGGCGGGGAACAAGATCTCTTAGCCGGGCT 

CTAGTTAAC^TGTTTGAGGAGTAAAGTGAAATGGTGCAAATTAGTTAAGGCTAAGAAATT 

GAAAAGCTTTTGTTTACCGAGAAAAAAACACACTCTAACTCTTGATGTGATGTAGTTAGT 

GTATTAATTAGAGGCTGCGTTTTCAA 

>G241 Amino Acid Sequence (domain in AA coordinates: 14-114) 
MGRAPCCEKMGLKRGPWTPEEDQILVSFILNHGHSNWRALPKQAGLLRCGKSCRLRWMNY
LKPDIKRGNFTKEEEDAIISLHQILGNRWSAIAAKLPGRTDNEIKNVWHTHLKKRLEDYQ
PAKPKTSNKKKGTKPKSESVITSSNSTRSESELADSSNPSGESLFSTSPSTSEVSSMTLI
SHDGYSNEINMDNKPGDISTIDQECVSFETFGADIDESFWKETLYSQDEHNYVSNDLEVA
GLVEIQQEFQNLGSANNEMIFDSEMELLVRCIG* 
>G254 (15..923)

CGATTTCGAGCTCTATGGTGTCCGTAAACCCTAGACCTAAGGGTTTTCCAGTTTTCGATT 
CCTCGAATATGAGTTTACCAAGCTCCGATGGATTTGGTTCGATTCCGGCCACGGGACGGA 
CCAGTACGGTGTCGTTTTCTGAGGATCCGACGACGAAGATTCGGAAGCCGTACACAATCA 
AGAAGTCGAGAGAGAATTGGACAGATCAAGAGCACGATAAATTTCTAGAAGCTCTTCACT 
TATTCGATAGGGATTGGAAGAAAATAGAAGCCTTTGTTGGATCAAAAACAGTAGTTCAGA 
TACGAAGCCACGCTCAGAAATACTTTCTCAAAGTTCAGAAGAGTGGTGCTAACGAACATC 
TTCCACTTCCTCGACCTAAGAGGAAAGCGAGTCATCCTTATCCTATAAAGGCTCCTAAAA 
ATGTTGCTTATACCTCTCTCCCGTCTTCGAGTACATTACCGTTGCTTGAGCCTGGTTATT 
TGTATAGCTCTGATTCGAAGTCATTGATGGGAAACCAGGCTGTTTGTGCATCTACCTCTT 
CTTCGTGGAATCATGAATCGACAAATCTGCCAAAACCGGTGATTGAAGAGGAACCGGGAG 

tctcggccacggctcctctcccaaataatcgctgcagacaggaagatacagagagggtac 
gagcagtgacaaagccaaataacgaagaaagttgtgaaaagccacatagagtgatgccga 
attttgctgaagtttacagcttcattggaagtgtcttcgatcccaacacatcaggccacc 
tc<^gagatt;u^gcagatggatccaataaatatggaaacggttcttttactgatgcaaa 
acctgtctgtaaatctgacaagtcccgagtttgcagagcaaaggaggttgatatcatcat 
acagcgctaaagctttgaaatagagatagaataaaacaataatgtaccttatgtgagatc 
AAGAGACAATCATCCAAGGTCTGTATGC^TTGCTTGGATTTAGGCCTCGTGTTCTCACTA

CAGGAGCAGAACCAATCGCAAAGACTCTTAGATGGCTACTGAGTTGTGGTTTTTATGTCT 

CTGTAAGTCGCGGTGGAGCACACGTGTTTGTCCTGTCTTGTGTATGTGTGTATAGATAAT 

ACAAGGTTTTGCAGAGTAAGGTCACAGTTAGCTGCAAGTGAGTTTGGATCAATCTTAAGA 

TTAAAACCCTGAGAGTGAGTGTCCAAAGAGACTGTGTAATATTGGTTTGGCGGTCAGCAG 

AAGAGTTTTGAAGTGCACATCCAGTTAGTGATAACACGGTTGAAGAAAAGGTAAGGTTAC 

AAGTTTAGTTTTGAATAATTGTATACTCAAAAAATATGAATGTATAAAGAATAATCACTT 

GAGTCGCCTTA 

>G254 Amino Acid Sequence (domain in AA coordinates: 62-106) 

MVSVNPRPKGFPVFDSSNMSLPSSDGFGSIPATGRTSTVSFSEDPTTKIRKPYTIKKSRE

NWTDQEHDKFLEALHLFDRDWKKIEAFVGSKTVVQIRSHAQKYFLKVQKSGANEHLPLPR

PKRKASHPYPIKAPKNVAYTSLPSSSTLPLLEPGYLYSSDSKSLMGNQAVCASTSSSWNH 

ESTNLPKPVIEEEPGVSATAPLPNNRCRQEDTERVRAVTKPNNEESCEKPHRVMPNFAEV 

YSFIGSVFDPNTSGHLQRLKQMDPINMETVLLLMQNLSVNLTSPEFAEQRRLISSYSAKA

LK* 

>G26 (73..729)

TTGGCTTGTACCCAAACCCATCTTTGACT^ 

CATCATCGGATAATGCATAGCGGGAAGAGACCTCTATCACCAGAATCAATGGCCGGAAAT 
AGAGAAGAGAAAAAAGAGTTGTGTTGTTGCTCAACTTTGTCGGAATCTGATGTGTCTGAT 
TTTGTCTCTGAACTCACTGGTCAACCCATCCCATCATCCATTGATGATCAATCTTCGTCG 
CTTACTCTTCAAGAAAAAAGTAACTCGAGGCAACGAAACTACAGAGGCGTGAGGCAAAGA 
CCGTGGGGAAAATGGGCGGCTGAGATTCGTGACCCGAACAAGGCAGCTCGTGTGTGGCTT 
GGGACGTTCGACACTGCAGAAGAAGCCGCCTTAGCGTATGATAAAGCTGCATTTGAGTTT 
AGAGGTCACAAGGCCAAGCTTAACTTCCCCGAGCATATTCGTGTCAACCCTACTCAACTC 
TATCCATCGCCCGCTACTTCCCATGATCGCATTATCGTGACACCACCTAGTCCACCTCCA 
CCAATTGCTCCTGACATACTTCTTGATCAATATGGCCACTTTCAATCTCGAAGTAGTGAT 
TCCAGTGCCAACTTGTCCATGAATATGCTGTCTTCTTCGTCTTCATCTTTGAATCATCAA
GGGCTAAGACCAAATTTGGAGGATGGTGAAAACGTGAAGAACATTAGTATCCACAAACGA 
CGAAAATAACATGTTAATGGCATAAATATCTCTTCGTCCAAGTTATCAAACGCATTGACC 
TCCGGCTTTGATCATTTTAGGCGCTTAATCTCTTTACGACTTCATTTTGGTAGTCTTTAA 
AGAGTCTATGGAGTGGATTTAGCTAGGAATCAGGCCTTATGGATGAAAAATATATAAATT 
TTGAACATGACTATGCAAGAATGGGATGAAGACTACTTAGCTTGGAAAACGTCCTGATAG 
GTCATGACGACTATATCCACAGAAGATGACCGACGGAGACAACAACATGCCTCACCTGAT 
CGACCGATCAAATGAGATAATGTGTTGACCGGACCGGTCGGATCAGGTTGGGTCGAGTAT 

ATCA 

>G26 Amino Acid Sequence (domain in AA coordinates: 67-134) 
MHSGKRPLSPESMAGNREEKKELCCCSTLSESDVSDFVSELTGQPIPSSIDDQSSSLTLQ

EKSNSRQRNYRGVRQRPWGKWAAEIRDPNKAARVWLGTFDTAEEAALAYDKAAFEFRGHK

AKLNFPEHIRVNPTQLYPSPATSHDRIIVTPPSPPPPIAPDILLDQYGHFQSRSSDSSAN

LSMNMLSSSSSSLNHQGLRPNLEDGENVKNISIHKRRK*

>G263 (48..902)

TTTTTAGTTTTATTTTTCTGTGGTAAAATAAAAAAAGTTCGCCGGAGATGACGGCTGTGA 
CGGCGGCGCAAAGATCAGTTCCGGCGCCGTTTTTAAGCAAAACGTATCAGCTAGTTGATG 
ATCATAGCACAGACGACGTCGTTTCATGGAACGAAGAAGGAACAGCTTTTGTCGTGTGGA 
AAACAGCAGAGTTTGCTAAAGATCTTCTTCCT 

GCTTCATTCGTCAGCTCAACACTTACGGATTTCGTAAAACTGTACCGGATAAATGGGAAT 
TTGCAAACGATTATTTCCGGAGAGGCGGGGAGGATCTGTTGACGGACATACGACGGCGTA 
AATCGGTGATTGCTTCAACGGCGGGGAAATGTGTTGTTGTTGGTTCGCCTTCTGAGTCTA 
ATTCTGGTGGTGGTGATGATCACGGTTCAAGCTCCACGTCATCACCCGGTTCGTCGAAGA 
ATCCTGGTTCGGTGGAGAACATGGTTGCTGATTTATCAGGAGAGAACGAGAAGCTTAAAC 
GTGAAAACAATAACTTGAGCTCGGAGCTCGCGGCGGCGAAGAAGCAGCGCGATGAGCTAG 
TGACGTTCTTGACGGGTCATCTGAAAGTAAGACCGGAACAAATCGATAAAATGATCAAAG 
GAGGGAAATTTAAACCGGTGGAGTCTGACGAAGAGAGTGAGTGCGAAGGTTGCGACGGCG 
GCGGAGGAGCAGAGGAGGGGGTAGGTGAAGGATTGAAATTGTTTGGGGTGTGGTTGAAAG
GAGAGAGAAAAAAGAGGGACCGGGATGAAAAGAATTATGTGGTGAGTGGGTCCCGTATGA 
CGGAAATAAAGAACGTGGACTTTCACGCGCCGTTGTGGAAAAGCAGCAAAGTCTGCAACT 
AAAAAAAGAGTAGAAGACTGTTCAAACCAGCGTGTGACACGTCATCGACGACGACGAAAA 






AAATGATTTAAAAAACTATTTCT 

AGGTGAAGAAGGTCCAGAAGGATCT^ACGCAAATATATAAATGGATTTTCATGTATTATAT 
AATTTAATTAGTGTATTAAGAAAA 

>G263 Amino Acid Sequence (domain in AA coordinates: TBD) 
MTAVTAAQRSVPAPFLSKTYQLVDDHSTDDVVSWNEEGTAFVVWKTAEFAKDLLPQYFKH
NNFSSFIRQLNTYGFRKTVPDKWEFANDYFRRGGEDLLTDIRRRKSVIASTAGKCVVVGS
PSESNSGGGDDHGSSSTSSPGSSKNPGSVENMVADLSGENEKLKRENNNLSSELAAAKKQ

RDELVTFLTGHLKVRPEQIDKMIKGGKFKPVESDEESECEGCDGGGGAEEGVGEGLKLFG

VWLKGERKKRDRDEKNYVVSGSRMTEIKNVDFHAPLWKSSKVCN*

>G308 (196..1794)

AGTAATTTAGTTTTTTTTTTTTTTTTTTACAATTTATTTTO 

AGTGAAAAAACAAATCCTAAGCAGTCCTAACCGATCCCCGAAGCTAAAGATTCTTCACCT 

TCCCAAATAAAGCAAAACCTAGATCCGACATTGAAGGAAAAACCTTTTAGATCCATCTCT 

GAAAAAAACCCAACCATGAAGAGAGATCATCATCATCATCATCAAGATAAGAAGACTATG

ATGATGAATGAAGAAGACGACGGTAACGGCATGGATGAGCTTCTAGCTGTTCTTGGTTAC 

AAGGTTAGGTCATCGGAAATGGCTGATGTTGCTCAGAAACTCGAGCAGCTTGAAGTTATG 

ATGTCTAATGTTCAAGAAGACGATCTTTCTCAACTCGCTACTGAGACTGTTCACTATAAT 

CCGGCGGAGCTTTACACGTGGCTTGATTCTATGCTCACCGACCTTAATCCTCCGTCGTCT 

AACGCCGAGTACGATCTTAAAGCTATTCCCGGTGACGCGATTCTCAATCAGTTCGCTATC 

GATTCGGCTTCTTCGTCTAACCAAGGCGGCGGAGGAGATACGTATACTACAAACAAGCGG 

TTGAAATGCTCAAACGGCGTCGTGGAAACCACCACAGCGACGGCTGAGTCAACTCGGCAT 

GTTGTCCTGGTTGACTCGCAGGAGAACGGTGTGCGTCTCGTTCACGCGCTTTTGGCTTGC 

GCTGAAGCTGTTCAGAAGGAGAATCTGACTGTGGCGGAAGCTCTGGTGAAGCAAATCGGA 

TTCTTAGCTGTTTCTCAAATCGGAGCTATGAGACAAGTCGCTACTTACTTCGCCGAAGCT 

CTCGCGCGGCGGATTTACCGTCTCTCTCCGTCGCAGAGTCCAATCGACCACTCTCTCTCC 

GATACTCTTCAGATGCACTTCTACGAGACTTGTCCTTATCTCAAGTTCGCTCACTTCACG 

GCGAATCAAGCGATTCTCGAAGCTTTTCAAGGGAAGAAAAGAGTTCATGTCATTGATTTC 

TCTATGAGTC^^GGTCTTCAATGGCCGGCGCTTATGCAGGCTCTTGCGCTTCGACCTGGT 

GGTCCTCCTGTTTTCCGGTTAACCGGAATTGGTC(^CCGGCACCGGATAATTTCGATTAT 

CTTCATGAAGTTGGGTGTAAGCTGGCTCATTTAGCTGAGGCGATTCACGTTGAGTTTGAG 

TACAGAGGATTTGTGGCTAACACTTTAGCTGATCTTGATGCTTCGATGCTTGAGCTTAGA 

CCAAGTGAGATTGAATCTGTTGCGGTTAACTCTGTTTTCGAGCTTCACAAGCTCTTGGGA 

CGACCTGGTGCGATCGATAAGGTTCTTGGTGTGGTGAATCAGATTAAACCGGAGATTTTC 

ACTGTGGTTGAGCAGGAATCGAACCATAATAGTCCGATTTTCTTAGATCGGTTTACTGAG 

TCGTTGCATTATTACTCGACGTTGTTTGACTCGTTGGAAGGTGTACCGAGTGGTCAAGAC 

AAGGTCATGTCGGAGGTTTACTTGGGTAAACAGATCTGCAACGTTGTGGCTTGTGATGGA 

CCTGACCGAGTTGAGCGTCATGAAACGTTGAGTCAGTGGAGGAACCGGTTCGGGTCTGCT 

GGGTTTGCGGCTGCACATATTGGTTCGAATGCGTTTAAGCAAGCGAGTATGCTTTTGGCT 

CTGTTCAACGGCGGTGAGGGTTATCGGGTGGAGGAGAGTGACGGCTGTCTCATGTTGGGT 

TGGCACACACGACCGCTCATAGCCACCTCGGCTTGGAAACTCTCCACCAATTAGATGGTG 

GCTCAATGAATTGATCTGTTGAACCGGTTATGATGATAGATTTCCGACCGAAGCCAAACT 

AAATCCTACTGTTTTTCCCTTTGTCACTTGTTAAGATCTTATCTTTCATTATATTAGGTA 

ATTGAAAAATTTTAATCTCGCCTAAATTACT 

>G308 Amino Acid Sequence (domain in AA coordinates: 270-274) 
MKRDHHHHHQDKKTMMMNEEDDGNGMDELLAVLGYKVRSSEMADVAQKLEQLEVMMSNVQ
EDDLSQLATETVHYNPAELYTWLDSMLTDLNPPSSNAEYDLKAIPGDAILNQFAIDSASS
SNQGGGGDTYTTNKRLKCSNGVVETTTATAESTRHVVLVDSQENGVRLVHALLACAEAVQ

KENLTVAEALVKQIGFLAVSQIGAMRQVATYFAEALARRIYRLSPSQSPIDHSLSDTLQM
HFYETCPYLKFAHFTANQAILEAFQGKKRVHVIDFSMSQGLQWPALMQALALRPGGPPVF
RLTGIGPPAPDNFDYLHEVGCKLAHLAEAIHVEFEYRGFVANTLADLDASMLELRPSEIE

SVAVNSVFELHKLLGRPGAIDKVLGVVNQIKPEIFTVVEQESNHNSPIFLDRFTESLHYY
STLFDSLEGVPSGQDKVMSEVYLGKQICNVVACDGPDRVERHETLSQWRNRFGSAGFAAA
HIGSNAFKQASMLLALFNGGEGYRVEESDGCLMLGWHTRPLIATSAWKLSTN* 
>G38 (149..1156)

GAGGAAAACTCGAAAAAGCTACACACAAGAAGAAGAAGAAAAGATACGAGCAAGAAGACT 
AAACACGAAAGCGATTTATCT^CTCGAAGGAAGAGACTTTGATTTTCAAATTTCGTCCCC 
TATAGATTGTGTTGTTTCTGGGAAGGAGATGGCAGTTTATGATCAGAGTGGAGATAGAAA
CAGAACACAAATTGATACATCGAGGAAAAGGAAATCTAGAAGTAGAGGTGACGGTACTAC

TGTGGCTGAGAGATTAAAGAGATGGAAAGAGTATAACGAGACCGTAGAAGAAGTTTCTAC 

CAAGAAGAGGAAAGTACCTGCGAAAGGGTCGAAGAAGGGTTGTATGAAAGGTAAAGGAGG 

ACCAGAGAATAGCCGATGTAGTTTCAGAGGAGTTAGGCAAAGGATTTGGGGTAAATGGGT 

TGCTGAGATCAGAGAGCCTAATCGAGGTAGCAGGCTTTGGCTTGGTACTTTCCCTACTGC 

TCAAGAAGCTGCTTCTGCTTATGATGAGGCTGCTAAAGCTATGTATGGTCCTTTGGCTCG 

TCTTAATTTCCCTCGGTCTGATGCGTCTGAGGTTACGAGTACCTCAAGTCAGTCTGAGGT 

GTGTACTGTTGAGACTCCTGGTTGTGTTCATGTGAAAACAGAGGATCCAGATTGTGAATC 

TAAACCCTTCTCCGGTGGAGTGGAGCCGATGTATTGTCTGGAGAATGGTGCGGAAGAGAT 

GAAGAGAGGTGTTAAAGCGGATAAGCATTGGCTGAGCGAGTTTGAACATAACTATTGGAG 

TGATATTCTGAAAGAGAAAGAGAAACAGAAGGAGCAAGGGATTGTAGAAACCTGTCAGCA 

ACAACAGCAGGATTCGCTATCTGTTGCAGACTATGGTTGGCCCAATGATGTGGATCAGAG 

TCACTTGGATTCTTCAGACATGTTTGATGTCGATGAGCTTCTACGTGACCTAAATGGCGA 

CGATGTGTTTGCAGGCTTAAATCAGGACCGGTACCCGGGGAACAGTGTTGCCAACGGTTC 

ATACAGGCCCGAGAGTCAACAAAGTGGTTTTGATCCGCTACAAAGCCTCAACTACGGAAT 

ACCTCCGTTTCAGCTCGAGGGAAAGGATGGTAATGGATTCTTCGACGACTTGAGTTACTT 

GGATCTGGAGT^OTAAACAAAACAATATGAAGCTTTTTGGATTTGATATTTGCCT 

CCAC^CGACTGTTGATTCTCTATCCGAGTTTTAGTGATATAGAGAACTACAGAACACGT 

TTTTTCTTGTTATAAAGGTGAACTGTATATATCGAAACAGTGATATGACAATAGAGAAGA 

CAACTATAGTTTGTTAGTCTGCTTCTCTTAAGTTGTOCTTTAGATATGTTTTATGTTTTG 

TAACAAC^GGAATGAATAATACACACTTGTGAAGCTTTTAAAAAAAAAAAAAAAAAAAAA 

>G38 Amino Acid Sequence (domain in AA coordinates: 76-143) 

MAVYDQSGDRNRTQIDTSRKRKSRSRGDGTTVAERLKRWKEYNETVEEVSTKKRKVPAKG

SKKGCMKGKGGPENSRCSFRGVRQRIWGKWVAEIREPNRGSRLWLGTFPTAQEAASAYDE
AAKAMYGPLARLNFPRSDASEVTSTSSQSEVCTVETPGCVHVKTEDPDCESKPFSGGVEP
MYCLENGAEEMKRGVKADKHWLSEFEHNYWSDILKEKEKQKEQGIVETCQQQQQDSLSVA

DYGWPNDVDQSHLDSSDMFDVDELLRDLNGDDVFAGLNQDRYPGNSVANGSYRPESQQSG 

FDPLQSLNYGIPPFQLEGKDGNGFFDDLSYLDLEN* 

>G43 (38..643)

CTCCTGTCTTGTCTAAAGAAAAAAGAGAGAGGAAGAAATGGAGACTTTTGAGGAAAGCTC 

TGATTTGGATGTTATACAGAAACATCTATTTGAAGACTTGATGATCCCTGATGGTTTCAT 

TGAAGATTTTGTCTTTGATGATACTGCTTTTGTCTCCGGACTCTGGTCTCTAGAACCCTT 

TAACCCAGTTCCGAAACTGGAACCTAGTTCACCTGTTCTTGATCCAGATTCCTATGTCCA 

AGAGATTCTGCAAATGGAAGCAGAATC^TCATCATCATCATC^C^CAACGTCACCTGA 

GGTTGAGACTGTCTCAAACCGGAAAAAAACAAAGAGGTTTGAAGAAACGAGACATTA<^ 

AGGCGTGAGAAGGAGGCCATGGGGGAAATTTGCAGCAGAGATTCGAGATCCGGCAAAGAA 

AGGATCCAGGATTTGGTTAGGCACTTTTGAGAGTGATATTGATGCTGCAAGGGCTTACGA 

CTATGCAGCTTTTAAGCTCAGGGGAAGAAAAGCTGTTCTCAACTTTCCTTTGGATGCCGG 

AAAGTATGATGCTCCGGTCAATTCATGCCGAAAAAGGAGGAGAACCGATGTACCACAGCC 

TCAAGGAACAACAACAAGTACTTCATCATCGTCATCAAACTAATGGGGGAATAGTGATGT 

TTAATTAGTATATATAGGTTAATATCTTAAGTATGTGAAGCATCATGTATAGAGCCAAGA 

ACCTGTTAGACTAGTGTACTGAAAAGAACTCTTGCAAAATATGTACTAAAGAGTTCCTGT 

AAC7UITGGAACTTCTGCGTTTTCTCTTGTCTTAAAGAGCTTAAGGTTCTAGAAACAAAGT 

TCTTGTCCTTTCGGTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

AAAAAAAAA

>G43 Amino Acid Sequence (domain in AA coordinates: 104-172) 

METFEESSDLDVIQKHLFEDLMIPDGFIEDFVFDDTAFVSGLWSLEPFNPVPKLEPSSPV 

LDPDSYVQEILQMEAESSSSSSTTTSPEVETVSNRKKTKRFEETRHYRGVRRRPWGKFAA

EIRDPAKKGSRIWLGTFESDIDAARAYDYAAFKLRGRKAVLNFPLDAGKYDAPVNSCRKR 

RRTDVPQPQGTTTSTSSSSSN* 

>G536 (1..768) 

ATGTCGACAAGGGAAGAGAATGTTTACATGGCGAAATTAGCCGAACAAGCTGAACGTTAC 
GAAGAAATGGTTGAATTCATGGAGAAAGTTGCGAAAACTGTTGATGTTGAGGAACTTTCA 
GTTGAAGAGAGGAATCTTCTCTCTGTTGCTTACAAGAACGTGATTGGAGCGAGAAGAGCT 
TCGTGGAGAATCATTTCTTCGATTGAGCAGAAAGAAGAGAGCAAAGGGAACGAAGATCAT 
GTTGCTATTATCAAGGATTACAGAGGAGAGATTGAATCCGAGCTTAGCAAAATCTGTGAT 
GGGATTTTGAATGTTCTTGAAGCTCATCTTATTCCTTCTGCTTCACCAGCTGAATCTAAA 
GTGTTTTATCTTAAGATGAAGGGTGATTATCATAGGTATCTTGCTGAGTTTAAGGCTGGT

GCTGAAAGGAAAGAAGCTGCTGAAAGCACTTTGGTTGCTTACAAGTCTGCTTCCGACATT 

GCCACTGCTGAGTTAGCTCCTACTCACCCGATAAGGCTTGGTCTTGCACTCAACTTCTCT 

GTGTTTTACTATGAAATCCTCAACTCGCCTGATCGTGCTTGCAGCCTCGCAAAGCAGGCG 

TTTGATGATGCAATCGCTGAGTTAGATACATTGGGTGAGGAATCATACAAGGACAGTACA 

CTGATTATGCAGCTTCTTAGAGACAATCTCACTCTCTGGACTTCAGATATGACTGACGAA 

GCAGGAGATGAGATTAAGGAGGCATCAAAGCCCGATGGTGCCGAGTAA 

>G536 Amino Acid Sequence (domain in AA coordinates: 226-233)

MSTREENVYMAKLAEQAERYEEMVEFMEKVAKTVDVEELSVEERNLLSVAYKNVIGARRA

SWRIISSIEQKEESKGNEDHVAIIKDYRGEIESELSKICDGILNVLEAHLIPSASPAESK

VFYLKMKGDYHRYLAEFKAGAERKEAAESTLVAYKSASDIATAELAPTHPIRLGLALNFS

VFYYEILNSPDRACSLAKQAFDDAIAELDTLGEESYKDSTLIMQLLRDNLTLWTSDMTDE

AGDEIKEASKPDGAE*

>G567 (38..1273)

AAAAAGAAGAATCAGAAAGTGAAAAAGAGAGCGAGCGATGAACAGTATCTTCTCCATTGA 
CGATTTCTCCGATCCTTTCTGGGAAACTCCTCCGATTCCTCTCAATCCCGACTCTTCTAA 
GCCTGTTACGGCGGATGAAGTTAGCCAGAGTCAACCGGAATGGACTTTCGAGATGTTTCT 
CGAAGAGATTTCTTCGTCGGCGGTGAGCTCTGAGCCACTTGGTAACAACAACAACGCGAT
CGTCGGTGTTTCTTCGGCGCAATCTCTTCCTTCTGTTTCCGGACAGAATGATTTCGAGGA 
TGATAGTCGATTTCGTGATCGCGATTCGGGAAATTTGGATTGTGCTGCTCCCATGACGAC 
GAAGACGGTGAATGTTGATTCCGATGATTATCGTCGTGTTCTTAAGAACAAGCTTGAGGC 
TGAGTGCGCGACTGGTGTTTCTCTTCGGGTTGGGTCTGTGAAGCCTGAAGATTCGACTAG 
TTCTCCAGAAACTCAACTTCAACCAGTTCAATCCAGTCCTCTTACTCAAGGAGAACTTGG 
TGTTACTTCTTCCTTACCAGCTGAGGTGAAAAAAACTGGTGTATCAATGAAGCAGGTTAC 
TAGTGGATCGTCGAGAGAATATTCTGATGACGAGGACCTTGATGAAGAGAATGAAACCAC 
CGGTTCCTTGAAGCCAGAGGACGTTAAAAAATCTAGAAGGATGCTGTCAAATCGTGAGTC 
AGCTAGGCGATCTAGAAGGAGAAAGCAGGAGCAAACAAGTGACCTCGAAACACAGGTTAA
TGATCTAAAAGGTGAGCATTCATCACTTCTTAAACAACTGAGCAACATGAATCACAAGTA 
TGACGAGGCTGCTGTTGGCAATAGAATACTAAAGGCTGACATTGAGACATTAAGAGCTAA 
GGTGAAAATGGCGGAAGAAACCGTGAAGAGAGTAACAGGAATGAATCCGATGCTTCTCGG 
AAGATCAAGTGGACATAACAACAACAACAGAATGCCAATAACTGGTAACAACAGGATGGA 
TTCTTCTAGCATTATTCCAGCTTATCAACCACACTCAAACCTAAACCATATGTCAAACCA 
AAACATCGGGATCCCAACCATTCTACCTCCAAGACTCGGAAACAATTTCGCTGCTCCTCC 
ATCCCAAACCAGCTCTCCCTTGCAGAGAATTAGAAATGGGCAAAATCACCATGTTACTCC
AAGCGCCAACCCGTATGGCTGGAATACCGAACCTCAGAACGATTCAGCATGGCCGAAAAA 
ATGCGTGGACTGATCAAACAAGAAGCGGGTTTCGCACTATATTAATGTCTATGCATCTGT 
AATTTGTAAGTGTTATTAAGTTACGAATCATGAGAAAACATCTTGTGAAAATACAGTCTC 
ATGGCTTATATATATATATAAGCTCTGTCTTATAACATTACAAGATTCTTATTTGAGAAT 
CGTCTTTCTATTTATAGCTAATAAAAAAAAAAAAAAAAA 

>G567 Amino Acid Sequence (domain in AA coordinates: 210-270)

MNSIFSIDDFSDPFWETPPIPLNPDSSKPVTADEVSQSQPEWTFEMFLEEISSSAVSSEP 

LGNNNNAIVGVSSAQSLPSVSGQNDFEDDSRFRDRDSGNLDCAAPMTTKTVNVDSDDYRR

VLKNKLEAECATGVSLRVGSVKPEDSTSSPETQLQPVQSSPLTQGELGVTSSLPAEVKKT 
GVSMKQVTSGSSREYSDDEDLDEENETTGSLKPEDVKKSRRMLSNRESARRSRRRKQEQT 
SDLETQVNDLKGEHSSLLKQLSNMNHKYDEAAVGNRILKADIETLRAKVKMAEETVKRVT
GMNPMLLGRSSGHNNNNRMPITGNNRMDSSSIIPAYQPHSNLNHMSNQNIGIPTILPPRL

GNNFAAPPSQTSSPLQRIRNGQNHHVTPSANPYGWNTEPQNDSAWPKKCVD* 

>G680 (338..2275)

CAGTTATCTTCTTCCTTCTTCTCTCTC 

TTTTGCTTCCGATTTGATTATTTCCGGGAACGATGACTTCTCCGGGGAGTTCCCGGTGAG 
ATGATAAGTCAGATTGCATACTTGTCTCCTCCATGGCTACTCTCAAGGGTTTTGGCTGCG 
GTGGATTCGTTTGGTTTCTCTAGAATCTAAAGAGGTTATCACAACGGCTTTGCAATTTGA 
AAACTTTCATGTTTGGGGAGATCAAAGATGGTTTCTTTTTTATACTTTACTTGTTAGAGA 
GGATTTGAAGCAGCGAATAGCTGCAACCGGTCCTGTTATGGATACTAATACATCTGGAGA 
AGAATTATTAGCTAAGGCAAGAAAGCCATATACAATAACAAAGCAGCGAGAGCGATGGAC 
TGAGGATGAGCATGAGAGGTTTCTAGAAGCCTTGAGGCTTTATGGAAGAGCTTGGCAACG 
AATTGAAGAACATATTGGGACTAAGACTGCTGTTCAGATCAGAAGTCATGCACAAAAGTT
CTTCACAAAGTTGGAGAAAGAGGCTGAAGTTAAAGGCATCCCTGTTTGCCAAGCTTTGGA

CATAGAAATTCCGCCTCCTCGTCCTAAACGAAAACCCAATACTCCTTATCCTCGAAAACC 

TGGGAACAACGGTACATCTTCCTCTCAAGTATCATCAGCAAAAGATGCAAAACTTGTTTC 

ATCGGCCTCTTCTTCACAGTTGAATCAGGCGTTCTTGGATTTGGAAAAAATGCCGTTCTC 

TGAGAAAACATCAACTGGAAAAGAAAATCAAGATGAGAATTGCTCGGGTGTTTCTACTGT 

GAACAAGTATCCCTTACCAACGAAACAGGTAAGTGGCGACATTGAAACAAGTAAGACCTC 

AACTGTGGACAACGCGGTTCAAGATGTTCCCAAGAAGAACAAAGACAAAGATGGTAACGA 

TGGTACTACTGTGCACAGCATGCT^AACTACCCTTGGCATTTCCACGCAGATATTGTGAA 

CGGGAATATAGCAAAATGCCCTCAAAATCATCCCTCAGGTATGGTATCTCAAGACTTCAT 

GTTTCATCCTATGAGAGAAGAAACTCACGGGCACGCAAATCTTCAAGCTACAACAGCATC 

TGCTACTACTACAGCTTCTCATCAAGCGTTTCCAGCTTGTCATTCACAGGATGATTACCG 

TTCGTTTCTCCAGATATCATCTACTTTCTCCAATCTTATTATGTCAACTCTCCTACAGAA 

TCCTGCAGCTCATGCTGCAGCTACATTCGCTGCTTCGGTCTGGCCTTATGCGAGTGTCGG 

GAATTCTGGTGATTCATCAACCCCAATGAGCTCTTCTCCTCCAAGTATAACTGCCATTGC

CGCTGCTACAGTAGCTGCTGCAACTGCTTGGTGGGCTTCTCATGGACTTCTTCCTGTATG 

CGCTCCAGCTCCAATAACATGTGTTCCATTCTCAACTGTTGCAGTTCCAACTCCAGCAAT 

GACTGAAATGGATACCGTTGAAAATACTCAACCGTTTGAGAAACAAAACACAGCTCTGCA 

AGATCAAACCTTGGCTTCGAAATCTCCAGCTTCATCATCTGATGATTCAGATGAGACTGG 

AGTAACCAAGCTAAATGCCGACTCAAAAACCAATGATGATAAAATTGAGGAGGTTGTTGT

TACTGCCGCTGTGCATGACTCAAACACTGCCCAGAAGAAAAATCTTGTGGACCGCTCATC 

GTGTGGCTCAAATACACCTTCAGGGAGTGACGCAGAAACTGATGCATTAGATAAAATGGA 

GAAAGATAAAGAGGATGTGAAGGAGACAGATGAGAATCAGCCAGATGTTATTGAGTTAAA 

TAACCGTAAGATTAAAATGAGAGACAACAACAGCAACAACAATGCAACTACTGATTCGTG 

GAAGGAAGTCTCCGAAGAGGGTCGTATAGCGTTTCAGGCTCTCTTTGCAAGAGAAAGATT 

GCCTCAAAGCTTTTCGCCTCCTCAAGTGGCAGAGAATGTGAATAGAAAACA7UVGTGACAC 

GTCAATGCCATTGGCTCCTAATTTCAAAAGCCAGGATTCTTGTGCTGCAGACCA^ 

AGTAGTAATGATCGGTGTTGGAACATGCAAGAGTCTTAAAACGAGACAGACAGGATTTAA 

GCCATAC^GAGATGTTC^TGGAAGTGAAAGAGAGCC^GTTGGGAACATAAACAATC^ 

AAGTGATGAAAAAGTCTGCAAAAGGCTTCGATTGGAAGGAGAAGCTTCTACATGACAGAC 

TTGGAGGTAAAAAAAAAAC^TCCACATTTTTATCAATATCTTOAAATCTA 

TTTGCTTCTCCAATCTTTATGAAAGAGA 

CATGTCAGGTTCTGTACCATATTACCCCATGTCTTGTCTCTTGTCTCTGTTTGTGTATGC 
TACTTGTGGTCTATATGTCATCTGCTACTACTGTTAATTAACCATTAAGCAATGGATTTG 
TCTTTA 

>G680 Amino Acid Sequence (domain in AA coordinates: 24-70) 

MDTNTSGEELLAKARKPYTITKQRERWTEDEHERFLEALRLYGRAWQRIEEHIGTKTAVQ

IRSHAQKFFTKLEKEAEVKGIPVCQALDIEIPPPRPKRKPNTPYPRKPGNNGTSSSQVSS

AKDAKLVSSASSSQLNQAFLDLEKMPFSEKTSTGKENQDENCSGVSTVNKYPLPTKQVSG

DIETSKTSTVDNAVQDVPKKNKDKDGNDGTTVHSMLNYPWHFHADIVNGNIAKCPQNHPS

GMVSQDFMFHPMREETHGHANLQATTASATTTASHQAFPACHSQDDYRSFLQISSTFSNL
IMSTLLQNPAAHAAATFAASVWPYASVGNSGDSSTPMSSSPPSITAIAAATVAAATAWWA
SHGLLPVCAPAPITCVPFSTVAVPTPAMTEMDTVENTQPFEKQNTALQDQTLASKSPASS
SDDSDETGVTKLNADSKTNDDKIEEVVVTAAVHDSNTAQKKNLVDRSSCGSNTPSGSDAE
TDALDKMEKDKEDVKETDENQPDVIELNNRKIKMRDNNSNNNATTDSWKEVSEEGRIAFQ

ALFARERLPQSFSPPQVAENVNRKQSDTSMPLAPNFKSQDSCAADQEGVVMIGVGTCKSL
KTRQTGFKPYKRCSMEVKESQVGNINNQSDEKVCKRLRLEGEAST* 
>G867 (64..1098)
CACAACACAAACACATTTCTGTTTTCTC 

TAAATGGAATCGAGTAGCGTTGATGAGAGTACTACAAGTACAGGTTCCATCTGTGAAACC 
CCGGCGATAACTCCGGCGAAAAAGTCGTCGGTAGGTAACTTATACAGGATGGGAAGCGGA 
TCAAGCGTTGTGTTAGATTCAGAGAACGGCGTAGAAGCTGAATCTAGGAAGCTTCCGTCG 
TCAAAATACAAAGGTGTGGTGCCACAACCAAACGGAAGATGGGGAGCTCAGATTTACGAG 
AAACACCAGCGCGTGTGGCTCGGGACATTCAACGAAGAAGACGAAGCCGCTCGTGCCTAC 
GACGTCGCGGTTCACAGGTTCCGTCGCCGTGACGCCGTCACAAATTTCAAAGACGTGAAG 
ATGGACGAAGACGAGGTCGATTTCTTGAATTCTCATTCGAAATCTGAGATCGTTGATATG 
TTGAGGAAACATACTTATAACGAAGAGTTAGAGCAGAGTAAACGGCGTCGTAATGGTAAC 
GGAAACATGACTAGGACGTTGTTAACGTCGGGGTTGAGTAATGATGGTGTTTCTACGACG 
GGGTTTAGATCGGCGGAGGCACTGTTTGAGAAAGCGGTAACGCCAAGCGACGTTGGGAAG
CTAAACCGTTTGGTTATACCGAAACATCACGCAGAGAAACATTTTCCGTTACCGTCAAGT 
AACGTTTCCGTGAAAGGAGTGTTGTTGAACTTTGAGGACGTTAACGGGAAAGTGTGGAGG 
TTCCGTTACTCGTATTGGAACAGTAGTCAGAGTTATGTTTTGACTAAAGGTTGGAGCAGG 
TTCGTTAAGGAGAAGAATCTACGTGCTGGTGACGTGGTTAGTTTCAGTAGATCTAACGGT 
CAGGATCAACAGTTGTACATTGGGTGGAAGTCGAGATCCGGGTCAGATTTAGATGCGGGT 
CGGGTTTTGAGATTGTTCGGAGTTAACATTTCACCGGAGAGTTCAAGAAACGACGTCGTA 
GGAAACAAAAGAGTGAACGATACTGAGATGTTATCGTTGGTGTGTAGCAAGAAGCAACGC 
ATCTTTCACGCCTCGTAACAACTCTTCTTCTTTTTTTTT 

TTAAAAACTCCATTTTCGTTTTCTTTATTTGCATCGGTTTCTTTCTTCTTGTTTA 

GGTTCATGAGTTGTTTTTGTTGTATTGATGAACTGTAAATTTTATTTATAGGATAAATTT 

TAAAAAAAAAAAAAAAAAAAA 

>G867 Amino Acid Sequence (domain in AA coordinates: 59-124) 
MESSSVDESTTSTGSICETPAITPAKKSSVGNLYRMGSGSSVVLDSENGVEAESRKLPSS
KYKGVVPQPNGRWGAQIYEKHQRVWLGTFNEEDEAARAYDVAVHRFRRRDAVTNFKDVKM
DEDEVDFLNSHSKSEIVDMLRKHTYNEELEQSKRRRNGNGNMTRTLLTSGLSNDGVSTTG

FRSAEALFEKAVTPSDVGKLNRLVIPKHHAEKHFPLPSSNVSVKGVLLNFEDVNGKVWRF
RYSYWNSSQSYVLTKGWSRFVKEKNLRAGDVVSFSRSNGQDQQLYIGWKSRSGSDLDAGR
VLRLFGVNISPESSRNDVVGNKRVNDTEMLSLVCSKKQRIFHAS*
>G956 (1..840) 

ATGGAGGAGACAGAAAAGAATAAGGGCAGCATAAGTATGGTTGAGGCTAATCTACCTCCT 

GGTTTTAGATTCCATCCTAGAGACGACGAGCTCGTCTGTGACTACTTAATGAGAAGAACC 

GTTCGC^GCCrCTATCAACCAGTTGTCTTGATCGACGTCGATCTTAACAAATGCGAGCCT 

TGGGACATTCCTCAAACGGCGAGAGTGGGAGGGAAAGAATGGTACTTTTACAGCCAAAAA 

GACCGTAAATACGCAACAGGCTACAGAACAAACCGGGCTACGGCCACCGGTTATTGGAAA 

GCCACCGGGAAAGATAGAGCAATCCAAAGAAACGGTGGTCTTGTGGGTATGAGAAAGACA

CTTGTGTTTTACCGAGGTCGATCCCCTAAAGGTCGTAAAACTGATTGGGTCATGCATGAG 

TTTCGTCTCCAAGGAAAACTTCTTCACCACTCCCCTAATTCTCTCGAGGAAGAGTGGGTA 

TTGTGTAGAGTTTTCCAC^GAACAGC^CGGAGCTGATATAGACGACATCACAAGGAGC 

TGCTCTGATGCAACAGCTTCTGCATTCATGGACTCTTACATCAACTTCGACCATCATCAC 

ATCATCAATCAGCATGTACCCTGCTTCTCCAATAATTTGTCACATAACCAAACCAACCAA 

TCCGGTTTAATCTCCAAGAACTCCAGCCC^TTGTTTAATGCTTCCCCTGATCAAATGATT 

CTCAGAACTTTGCTAAGTCAACTCACAAAAAAAGTCGAAGAATCACAGAGTCGTGGAGAC 

GGAAGCTCAGAGAGCCAATTGACCGACATTGGCATCCCAAGCCATGCATGGAATTACTGA 

>G956 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEETEKNKGSISMVEANLPPGFRFHPRDDELVCDYL^ 

WDIPQTARVGGKEWYFYSQKDRKYATGYRTNRATATGYWKATGKDRAIQRNGGLVGMRKT
LVFYRGRSPKGRKTDWVMHEFRLQGKLLHHSPNSLE 

CSDATASAFMDSYINFDHHHIINQHVPCFSNNLSHNQTNQSGLISKNSSPLFNASPDQMI

LRTLLSQLTKKVEESQSRGDGSSESQLTDIGIPSHAWNY*

>G996 (53..1063)

CGATCGATCTTGAATTGATTCTTTGTAGTATTTTATTTACATATATATATAGATGGGAAG 
AC^TTC^TGTTGTTACAAACAGAAACTGAGGAAAGGACTTTGGTCTCCTGAAGAAGATGA 
GAAGCTTCTTCGTTACATCACTAAGTATGGTCATGGTTGCTGGAGCTCTGTCCCTAAACA 
AGCTGGTTTACAGAGATGTGGAAAAAGTTGTAGATTAAGATGGATAAATTATTTAAGACC 
AGATTTGAAGAGAGGAG(^TTTTCT(^GATGAAGAAAATCTCATTATTGAACTTCATGC 
CGTTCTTGGCAATAGATGGTCTCAGATAGCTGCACAGCTTCCTGGAAGAACCGACAATGA 
AATCAAGAATCTTTGGAATTCTTGTTTGAAGAAGAAATTGAGGCTGAGAGGAATTGACCC 
GGTTACACACAAGCTCTTAACCGAAATCGAAACCGGTACAGATGACAAAACAAAACCGGT 
TGAGAAGAGTCAACAGACCTACCTCGTTGAGACTGATGGCTCCTCTAGTACCACTACTTG 
TAGTACTAACCAAAACAACAACACTGATCATCTTTATACCGGAAATTTCGGTTTTCAACG 
GTTAAGTCTAGAAAACGGTTCAAGAATCGCAGCCGGTTCTGACCTCGGTATCTGGATTCC 
CCAAACCGGAAGAAACCATCATCATCATGTCGATGAAACCATCCCTAGTGCAGTGGTACT 
ACCCGGTTCAATGTTCTCATCCGGTTTAACCGGTTATAGATCCTCCAATCTCGGTTTAAT 
TGAATTGGAAAACTCATTCTCAACCGGGCCAATGATGACAGAGCATCAGCAAATTCAAGA 
GAGTAACTACAACAATTCAACATTCTTTGGAAATGGGAATCTC 

GGAGGAAAATCAAAATCCATTCACAATATCGAATCATTCAAATTCGTCCTTATACAGTGA 
TATAAAATCAGAGACCAATTTTTTTGGCACAGAGGCTACAAATGTTGGTATGTGGCCATG
TAACCAGCTTCAGCCTCAGCAACATGCATATGGCCATATATAAATCTTCTTGTATATTAT 

AA

>G996 Amino Acid Sequence (domain in AA coordinates: 14-114) 

MGRHSCCYKQKLRKGLWSPEEDEKLLRYITKYGHGCWSSVPKQAGLQRCGKSCRLRWINY 

LRPDLKRGAFSQDEENLIIELHAVLGNRWSQIAAQLPGRTDNEIKNLWNSCLKKKLRLRG 

IDPVTHKLLTEIETGTDDKTKPVEKSQQTYLVETDGSSSTTTCSTNQNNNTDHLYTGNFG 

FQRLSLENGSRIAAGSDLGIWIPQTGRNHHHHVDETIPSAVVLPGSMFSSGLTGYRSSNL 

GLIELENSFSTGPMMTEHQQIQESNYNNSTFFGNGNLNWGLTMEENQNPFTISNHSNSSL

YSDIKSETNFFGTEATNVGMWPCNQLQPQQHAYGHI*

>G1946 (90..1547)

TCTCACCTATTGTAAAAATCACCAGTTTCGTATATAAAACCCTAATTTTCTCAAAATTCC 
CAAATATTGACTTGGAATCAAAAATCCGAATGGATGTGAGCAAAGTAACCACAAGCGACG 
GCGGAGGAGATTCAATGGAGACTAAGCCATCTCCTCAACCTCAGCCTGCGGCGATTCTAA 
GTTCAAACGCGCCTCCTCCGTTTCTGAGCAAGACCTATGATATGGTTGATGATCACAATA 
CAGATTCGATTGTCTCTTGGAGTGCTAATAACAACAGTTTTATCGTTTGGAAACCACCGG 
AGTTCGCTCGCGATCTTCTTCCTAAGAACTTTAAGCATAATAATTTCTCCAGCTTCGTTA 
GACAGCTTAATACCTATGGTTTCAGGAAGGTTGACCCAGATAGATGGGAATTTGCGAATG 
AAGGTTTTTTAAGAGGTCAGAAGCACTTGCTACAATCAATAACTAGGCGAAAACCTGCCC 
ATGGACAGGGACAGGGACATCAGCGATCTCAGCACTCGAATGGACAGAACTCATCTGTTA 
GCGCATGTGTTGAAGTTGGCA^TTTGGTCTCGAAGAAGAAGTTGAAAGGCTTAAAAGAG 
ATAAGAACGTCCTTATGCAAGAACTCGTCAGATTAAGACAGCAGCAACAGTCCACTGATA 
ACCAACTTCAAACGATGGTTCAGCGTCTCCAGGGCATGGAGAATCGGCAACAACAATTAA 
TGTCATTCCTTGCAAAGGGAGTACAAAGCCCTCA 

AGAATCAGCAAAACGAGAGTAATAGGCGCATCAGTGATACCAGTAAGAAGCGGAGATTCA 

AGCGAGACGGCATTGTCCGTAATAATGATTCTGCTACTCCTGATGGACAGATAGTGAAGT

ATCAACCTCCAATGCACGAGCAAGCCAAAGCAATGTTTAAACAGCTTATGAAGATGGAAC 

CTTACAAAACCGGCGATGATGGTTTCCTTCTAGGTAATGGTACGTCTACTACCGAGGGAA 

CAGAGATGGAGACTTCATCAAACCAAGTATCGGGTATAACTCTTAAGGAAATGCCTACAG 

CTTCTGAGATACAGTCATCATCACCAATTGAAACAACTCCTGAAAATGTTTCGGCAGCAT 

C^GAAGCAACCGAGAACTGTATTCCTTCACCTGATGATCTAACTCTTCCCGACTTCACTC 

ATATGCTACCGGAAAATAATTCAGAGAAGCCTCCAGAGAGTTTCATGGAACCAAACCTGG 

GAGGTTCTAGTCCATTACTAGATCCAGATCTGTTGATCGATGATTCTTTGTCCTTCGACA 

TTGACGACTTTCCAATGGATTCTGATATAGACCCTGTTGATTACGGTTTACTCGAACGCT 

TACTCATGTCAAGCCCGGTTCCAGATAATATGGATTCAACACCAGTGGACAATGAAACAG 

AGCAGGAACAAAATGGATGGGACAAAACTAAGCATATGGATAATCTGACTCAACAGATGG 

GTCTCCTCTCTCCTGAAACCTTAGATCTCTCAAGGCAAAATCCTTGATTTTGGGAGTTTT 

TAAAGTCTTTTGAGGTAACACAGTCCCTGAGAGCAGCATATTCAT 

>G1946 Amino Acid Sequence (domain in AA coordinates: 32-130) 

MDVSKVTTSDGGGDSMETKPSPQPQPAAILSSNAPPPFLSKTYDMVDDHNTDSIVSWSAN 

NNSFIVWKPPEFARDLLPKNFKHNNFSSFVRQLNTYGFRKVDPDRWEFANEGFLRGQKHL

LQSITRRKPAHGQGQGHQRSQHSNGQNSSVSACVEVGKFGLEEEVERLKRDKNVLMQELV

RLRQQQQSTDNQLQTMVQRLQGMENRQQQLMSFLAKAVQSPHFLSQFLQQQNQQNESNRR 

ISDTSKKRRFKRDGIVRNNDSATPDGQIVKYQPPMHEQAKAMFKQLMKMEPYKTGDDGFL

LGNGTSTTEGTEMETSSNQVSGITLKEMPTASEIQSSSPIETTPENVSAASEATENCIPS 

PDDLTLPDFTHMLPENNSEKPPESFMEPNLGGSSPLLDPDLLIDDSLSFDIDDFPMDSDI 

DPVDYGLLERLLMSSPVPDNMDSTPVDNETEQEQNGWDKTKHMDNLTQQMGLLSPETLDL

SRQNP*
>G217 (84..2618)

cttcgttcttaccga^ttccacgagcattagcttcagagaccttgaattggagtgcggtt 
ggatcaaaaacagttgagcgaagatgaggattatgattaagggaggtgtttggaagaaca 
ccgaagatgagattctcaaagccgccgtgatgaagtatggtaagaaccaatgggctcgga 
tctcgtctcttctcgttcgtaagtctgctaaacagtgtaaagctcgctggtacgagtggc 
tcgatccatctatcaaaaagactgaatggaccagagaagaagatgagaagcttctacatc 
ttgctaaacttctgcctactcaatggagaactattgctcctattgtgggtcgtacaccat 
ctcaatgtcttgagaggtatgagaagctccttgatgcagcatgcactaaggatgaaaatt 
atgatgcagcggatgatccacgaaaattacgtcctggtgagattgatccgaacccagaag 
caaagcctgctcgtcctgatccggtagacatggacgaagatgagaaagaaatgctttctg
aagcaagagctagattggctaacacgaggggaaagaaggctaaaagaaaagctagagaaa 
aacaacttgaggaagctagaaggcttgcttctctgcaaaaaagaagagaactaaaagcag 
ctgggattgatggaaggcataggaaaagaaagagaaagggaatcgactataatgcagaaa 
ttccttttgaaaagagggcacctgcgggattttatgatactgcggatgaagatcgtcctg 
ctgatcaagtaaaatttccaactaccattgaagaacttgaaggaaaaagaagagctgatg 
tagaagcacatttacgcaaacaagatgttgcaaggaataaaattgctcagagacaggatg 
ctccagcagctatattgcaagcaaacaagctgaatgatccggaagttgttaggaagaggt 
caaagctgatgttaccaccaccgcagatttcagaccacgagctagaagaaattgctaaga 
tgggctatgccagtgaccttcttgccgagaatgaggagctaacagaaggcagtgctgcta 
ctcgtgcacttttggcaaattactcacaaacaccaaggcaaggaatgacacccatgagga 
cacctcaaagaactcctgctggtaaaggtgatgctattatgatggaagcagaaaacctgg 
ccagattaagagactctcagacacctttgctaggaggagaaaatcctgagttgcaccctt 
ctgacttcactggggtcactccgagaaagaaggagattcaaacgcctaatccaatgttga 
ccccttcaatgactcctggtggtgctggtcttactccaagaattggcttgacgccatcaa 
gggatgggtcttctttttctatgacacccaaagggactcccttcagggatgaacttcaca 
ttaacgaagacatggacatgcagcaaagtgcaaaacttgagaggcagagacgagaggaag 
ctagaaggagtttacgctctggtttgactgggcttcctcagccaaagaacgagtaccaaa 
tagttgcacaacctcctcctgaggaaagtgaagagccagaagagaaaattgaggaagaca 
tgtcagacaggatagcgagggaaaaggcggaggaagaagcaagacaacaggcattgctta 
agaagagatccaaggtcttgcagagagatcttcctagacccccagctgcttcattggcag 
taattaggaactcgttgctttcagctgatggagacaaaagttctgttgttcctcctactc 
cgattgaggttgcagataaaatggtaagagaggagcttctacagttgctggagcatgata 
atgcaaagtatccgcttgatgacaaagctgagaagaagaaaggagccaagaaccgtacca 
accgttctgcttctcaagttcttgcaattgacgattttgatgaaaatgagctccaagagg 
ctgacaaaatgataaaggaggaggggaagtttctgtgtgtgtcaatgggacatgagaaca 
agacacttgatgattttgtagaagctcacaacacatgcgtgaatgatctcatgtatttcc 
ccactcgaagcgcttacgagctctcaagtgttgctgggaacgcggacaaagttgcagctt 
ttcaggaggagatggagaatgtgagaaaaaagatggaggaggatgagaagaaggcagaac 
acatgaaggccaagtacaaaacttatacaaagggtcatgagaggagggcagagaccgtgt 
ggacccaaatagaggcgacattgaagcaggctgagattggtggaacagaagtagagtgct 
ttaaagcattgaagaggcaagaagagatggctgcatcttttaggaaaaagaatttgcaag 
aggaagtgataaagcaaaaggaaacagagagtaaactgcagactcgctatgggaatatgt 
tggcaatggttgaaaaagcagaggagataatggtcggtttccgagcacaggcattgaaga 
aacaagaggatgttgaagattctcacaaactgaaagaagctaagctagccactggagagg 
aagaggacatagccatagccatggaagcttctgcataaaaacttgagttttgtattgctt 
acaagttttaaggagacgtagcttgactttgtattggtaagtttttttaatatgagtcat 
gactttgtaaaaaggttatgatatattctctgtttgtatgctttgcaagagtcaagaaat 
ttgaatgcttcaggatcaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 

>G217 Amino Acid Sequence (conserved domain in AA coordinates: 8-67)
MRIMIKGGVWKNTEDEILKAAVMKYGKNQWARISSLLVRKSAKQCKARWYEWLDPSIKKT
EWTREEDEKLLHLAKLLPTQWRTIAPIVGRTPSQCLERYEKLLDAACTKDENYDAADDPR
KLRPGEIDPNPEAKPARPDPVDMDEDEKEMLSEARARLANTRGKKAKRKAREKQLEEARR

LASLQKRRELKAAGIDGRHRKRKRKGIDYNAEIPFEKRAPAGFYDTADEDRPADQVKFPT
TIEELEGKRRADVEAHLRKQDVARNKIAQRQDAPAAILQANKLNDPEVVRKRSKLMLPPP

QISDHELEEIAKMGYASDLLAENEELTEGSAATRALLANYSQTPRQGMTPMRTPQRTPAG
KGDAIMMEAENLARLRDSQTPLLGGENPELHPSDFTGVTPRKKEIQTPNPMLTPSMTPGG

AGLTPRIGLTPSRDGSSFSMTPKGTPFRDELHINEDMDMQQSAKLERQRREEARRSLRSG

LTGLPQPKNEYQIVAQPPPEESEEPEEKIEEDMSDRIAREKAEEEARQQALLKKRSKVLQ

RDLPRPPAASLAVIRNSLLSADGDKSSVVPPTPIEVADKMVREELLQLLEHDNAKYPLDD

KAEKKKGAKNRTNRSASQVLAIDDFDENELQEADKMIKEEGKFLCVSMGHENKTLDDFVE

AHNTCVNDLMYFPTRSAYELSSVAGNADKVAAFQEEMENVRKKMEEDEKKAEHMKAKYKT

YTKGHERRAETVWTQIEATLKQAEIGGTEVECFKALKKQEEMAASFRKKNLQEEVIKQKE 

TESKLQTRYGNMIiAMVEKAEEIMVGFRAQALKKQEDVED^ 

EASA* 

>G2192 (92.. 2971) 

CGGAAAGAGATCAACCAACGATAGAGGAGAAGAAGAACTTGCATACGCAAAAAAACTTTC 
CCGGGAAAATTCCAGAAACTGCTTTGGAAAAATGTGCGAGCCCGATGATAATTCCGCTAG

AAACGGCGTCACTACTCAACCTTCGAGGTCAAGGGAGCTTCTAATGGATGTTGACGACTT 

AGATCTTGACGGTTCATGGCCACTAGATCAAATCCCTTACTTATCCTCATCGAATCGCAT 

GATTTCTCCGATTTTTGTCTCCTCTTCCTCTGAGCAGCCTTGCTCGCCTCTCTGGGCTTT 

CTCCGACGGTGGAGGAAATGGTTTTCACCACGCAACCTCCGGTGGCGATGATGAGAAGAT 

CAGCTCTGTCTCCGGTGTTCCTTCTTTeCGTCTCGCCGAGTATCCTCTCTTCCTCCCTTA 

CTCTTCTCCATCAGCAGCTGAGAACACAACAGAGAAGCATAACAGTTTCCAGTTTCCGTC 

TCCATTGATGAGCCTAGTCCCACCAGAGAACACAGACAACTACTGTGTGATCAAAGAGAG 

GATGACTCAGGCGCTTCGATACTTCAAAGAATCAACCGAACAACACGTTTTGGCTCAGGT 

CTGGGCTCCTGTGAGAAAGAATGGTCGTGATTTGCTGACGACTTTGGGTCAACCTTTTGT 

TCTTAATCCTAATGGTAATGGGCTTAATCAATACAGGATGATCTCTCTCACATATATGTT 

TTCTGTGGATAGTGAAAGTGACGTAGAGCTCGGACTCCCGGGTCGAGTTTTCCGTCAGAA 

ATTGCCTGAATGGACTCCAAATGTTCAGTACTATTCCA^ 

TCACGCCTTGCACTACAACGTGCGTGGTACACT^ 

TCAGTCCTGCATAGGTGTTGTGGAACTTATAATGACCTCAGAGAAGATTCACTATGCACC 
CGAAGTGGACAAAGTTTGCAAAGCCCTTGAGGCGGTAAATCTGAAAAGCTCGGAAATACT 
TGATCACCAAACAACACAGATATGCAATGAGAGTCGCCAAAACGCGCTTGCTGAGATTCT 
CGAAGTGCTGACAGTTGTATGTGAGACCCATAACTTGCCTCTCGCTCAGACTTGGGTTCC 
ATGTCAGCATGGGAGCGTTCTTGCCAATGGTGGCGGTCTAAAGAAAAACTGCACCAGCTT 
TGACGGTAGCTGCATGGGTCAAATCTGCATGTCTACAACCGACATGGCCTGCTATGTCGT 
GGATGCTCZATGTCTGGGGCTTTAGAGATGCCTGTCTTGAACACCATCTCCAGAAAGGCCA 
GGGAGTCGCTGGACGAGCTTTTCTCAATGGTGGCT 

GTTCTGCAAAACGCAGTACCCACTAGTCCATTATGCGCTCATGTTG^GTTGACCACTTG 
TTTTGCAATATCTCTCCAGAGCTCTTACACGGGCGACGACAGTTACATTCTTGAATTTTT 
TCTTCCTTCGAGTATAACAGACGACCAAGAGCAAGATTTGCTGTTGGGTTCTATTTTGGT 
GACAATGAAAGAACATTTTCAGAGTCTGAGGGTTGCATCTGGGGTTGACTTTGGTGAAGA 
TGACGACAAATTGTCTTTCGAGATCATCCAAGCATTACCGGACAAGAAGGTTCATTCAAA 
AATAGAATCCATTCGAGTTCCCTTTTCTGGTTTTAAGTCAAATGCAACAGAGACGATGTT 
GATTCCTCAGCCTGTGGTTCAGTCTTCTGATCCAGTAAATGAGAAAATCAACGTGGCCAC 
TGTTAACGGTGTGGTTAAGGAGAAGAAGAAAACAGAGAAAAAGCGTGGGAAGACTGAGAA 
AACAATCAGTCTAGATGTACTTCAGCAGTATTTCACTGGAAGTCTCAAAGACGCTGCAAA 
GAGCCTAGGAGTTTGCCCGACGACAATGAAGCGAATTTGCAGGCAACACGGAATCTCGCG 
GTGGCCATCGAGGAAGATCAAGAAAGTGAATCGTTCAATCACAAAGCTGAAACGAGTCAT 
CGAATCTGTTCAAGGTACTGATGGAGGCCTCGACCTGACTTCCATGGCCGTTAGTTCCAT 
CCCTTGGACACACGGTCAAACATCAGCACAGCCAC^ 

ACCTGAGCTACCAAACACCAATAATTCACCTAACC^TTGGTCAAGTGATCACAGTCCGAA 
CGAGCCAAATGGTTCGCCTGAGTTACCACCAAGCAATGGTCACAAGCGATCACGAACGGT 
GGATGAGAGCGCTGGGACTCCAACCTCTCATGGCTCATGTGACGGTAACCAATTAGATGA 
ACCGAAAGTCCCAAATCAAGATCCGCTCTTCACGGTTGGTGGATCACCCGGGCTCCTTTT 
TCCACCTTATTCTAGAGATCATGATGTATCTGCAGCTTCCTTCGCAATGCCGAACAGGCT 
TCTTGGTTCTATAGACCATTTCCGAGGAATGCTCATTGT^AGACGCTGGAAGTTCAAAAGA 
TCTGAGAAATCTCTGCCCCACTGCAGCATTTGACGATAAGTTTCAAGACACAAACTGGAT 
GAACAATGATAATAATAGCAACAACAACTTATACGCTCCCCCAAAGGAAGAGGCCATTGC 
AAATGTTGCATGCGAACCATCAGGCTCAGAAATGAGAACGGTAACAATCAAAGCAAGTTA 
CAAAGACGACATAATACGGTTCAGAATATCCTCGGGTTCAGGTATAATGGAATTGAAGGA 
TGAAGTGGCTAAGAGGCTGAAAGTTGATGCAGGAACGTTCGATATCAAGTATCTTGACGA 
TGATAACGAATGGGTTTTAATAGCTTGTGATGCTGATCTTCAAGAATGTCTCGAGATCCC 
TAGATCCTCCCGCAeGAAAATCGTAAGGCTCTTAGTTCATGATGTAACGACAAATCTAGG 
GAGCTCCTGCGAGAGCACTGGAGAATTGTGACCTGATAATTCATTCGAACTCTTTTGTAA 
ATAG 

>G2192 Amino Acid Sequence (conserved domain in AA coordinates: 600-700)

MCEPDDNSARNGVTTQPSRSRELLMDVDDLDLDGSWPLDQIPYLSSSNRMISPIFVSSSS 

EQPCSPLWAFSDGGGNGFHHATSGGDDEKISSVSGVPSFRLAEYPLFLPYSSPSAAENTT 

EKHNSFQFPSPLMSLVPPENTDNYCVIKERMTQALRYFKESTEQHVLAQVWAPVRKNGRD

LLTTLGQPFVLNPNGNGLNQYRMISIiTYMFSVDSESDVELGLPGRVFRQKLPEWTPNVQY 

YSSKBFSRLDHALHYNVRGTLALPVTNPSGQSC^ 

AVNLKSSEILDHQTTQIOTESRQNALAEILEV^^ 
GGLKKNCTSFDGSCMGQICMSTTDMACYVVDAHVWGFRDACLEHHLQKGQGVAGRAFLNG
GSCFCRDITKFCKTQYPLVHYALMFKLTTCFAISLQSSYTGDDSYILEFFLPSSITDDQE 
QDLLLGSILVTMKEHFQSLRVASGVDFGEDDDKLSFEIIQALPDKKVHSKIESIRVPFSG
FKSNATETMLIPQPWQSSDPWEKINVATWGWKEKKKTEKKRGKTEKTISLDVLQQY 
FTGSLKDAAKSLGVCPTTMKRICRQHGISRWPSRKIKKVNRSITKLKRVIESVQGTDGGL 
DLTSMAVSSIPWTHGQTSAQPLNSPNGSKPPELPNTNNSPNHWSSDHSPNEPNGSPELPP 
SNGHKRSRTVDESAGTPTSHGSCDGNQLDEPKVPNQDPLFTVGGSPGLLFPPYSRDHDVS 
AASFAMPNRLLGSIDHFRGMLIEDAGSSKDLRNLCPTAAFDDKFQDTNWMNNDNNSNNNL
YAPPKEEAIANVACEPSGSEMRTVTIKASYKDDIIRFRISSGSGIMELKDEVAKRLKVDA 
GTFDIKYLDDDNEWVLIACDADLQECLEIPRSSRTKIVRLLVHDVTTNLGSSCESTGEL* 
>G504 (69.. 1040) 

CGTCGACCTCTTGACGATCATGAGACTGATTTCGTGAAAATATCGTCATTATATCAAATT 

AGAAGTTGATGGAAAACATGGGGGATTCGAGCATAGGGCCGGGCCATCCGCATCTCCCTC 

CCGGGTTTCGGTTTCACCCGACTGATGAGGAACTAGTAGTTCATTACCrCAAGAAGAAAG 

CAGATTCTGTTCCACTTCCAGTCTCAATCATCGCAGAGATTGATCTTTACAAGTTTGATC 

CTTGGGAGCTTCCAAGCAAGGCGAGTTTTGGAGAGCACGAGTGGTACTTCTTTAGTCCTC 

GGGATCGGAAGTATCCAAATGGGGTTAGGCCAAACCGGGCAGCAACTTCCGGTTATTGGA 

AAGCAACGGGAACCGATAAACCGATATTTACGTGCAATAGTCACAAGGTTGGTGTCAAGA 

AAGCGCTTGTTTTTTACGGTGGAAAGCCTCCTAAAGGGATAAAAACAGATTGGATCATGC 

ATGAATATCGCCTCACTGATGGTAACCTTAGCACTGCGGCTAAGCCGCCTGACTTAACCA 

CGACAAGGAAAAACTCACTACGGCTAGACGATTGGGTTCTATGTAGGATCTATAAGAAGA 

ATAGTTCACAAAGACCAACAATGGAGAGAGTATTACTTAGAGAGGATCTAATGGAAGGCA 

TGCTCTCAAAATCATCTGCTAATTCTTCTTCTACATCAGTACTAGACAACAACGACAACA 

ATAATAACAATAACGAAGAACACTTTTTCGACGGTATGGTCGTTTCTTCAGACAAACGTT 

CCTTGTGTGGTCAATACCGAATGGGCCACGAGGCCTCAGGATCATCTTCATTCGGATCTT 

TCTTATCGAGCAAGAGGTTTCATCATACAGGTGATCTCAACAATGATAACTACAATGTCT 

CTTTTGTTTCGATGCTTAGTGAGATTCCTCAGAGTTCGGGGTTTCATGCAAATGGTGTTA 

TGGATACGACGTCGTCTCTAGCTGATCATGGGGTTTTAAGACAGGCGTTTCAGCTTCCTA 

ACATGAACTGGCACTCATAATCTATATAGATATATATGTGTGTATCATATATGTATCTAT 

GCAGGCCTAATATAGTTTACACATAAATCATCTGGGGCGGCCGCT 

>G504 Amino Acid Sequence (domain in AA coordinates: TBD)

MENMGDSSIGPGHPHLPPGFRFHPTDEELVVHYLKKKADSVPLPVSIIAEIDLYKFDPW 

LPSKASFGEHEVfYFFSPRDRKYPNGWPNRAATSGYWKATGTDKPIFTCNSHKVGVKK^ 

VFYGGKPPKGIKTDWIMHEYRLTDGNLSTAAKPPDLTTTRKNSLRLDDWVLCRIYKKNSS

QRPTMERVLLREDLMEGMLSKSSANSSSTSVLDNNDNNNNNNEEHFFDGMVVSSDKRSLC

GQYRMGHKASGSSSFGSFLSSKRFHHTGDLNfcTONYOT 

TSSLADHGVLRQAFQLPNMNWHS* 

>G622 (248.. 2620) 

TCTTTCTTTCTTCAATTCX3CCGTCAAAATCTTCTCTTTCTCTTCCCCCGCCGGTCCTTCA 
CCAATCCTCTGATCTCTCTACACACGAACCTTTGATTTTGACCAACGTCGATGCATGTTC 
AJGACTAGTCTCTTCCTCAATCCTTCAATTTCATCAATTCACGTCGATTTCGTATCCGAT 
TCGTTGTTCTAGCTCTTTGTGTGGTGTTAGGGTTTTAAGATTTTGGAATTGGGGTTTGGA 
GTTTGTGATGTTTGAAGTCAAT^ATGGGGTCAAAGATGTGCATGAACGCTTCATGTGGTAC 
GACTTCTACTGTTGAATGGAAGAAAGGTTGGCCTCTTCGATCTGGTCTTCTCGCTGATCT 
CTGTTATCGTTGCGGATCTGCGTATGAGAGTTCTCTATTCTGTGAACAATTTCATAAGGA 
CCAATCTGGTTGGAGGGAATGCTATTTGTGTAGCAAGAGACTACATTGTGGATGCATTGC 
TTCTAAGGTAACGATTGAGTTAATGGACTATGGTGGTGTTGGTTGTAGTACATGTGCTTG 
CTGCCATCAACTCAftTTTGAACACAAGGGGTGAGAATC 

AATGAAAACGTTAGCTGATAGGCAACATGTAAATGGCGAAAGCGGAGGAAGAAACGAAGG 
CGATCTCTTTTCTCAGCCACTAGTCATGGGCGGAGATAAAAGGGAAGAGTTCATGCCTCA 
CCGTGGGTTTGGTAAGCTAATGAGTCCAGAAAGTACAACCACCGGGCATAGGCTGGATGC 
TGCTGGGGAAATGGATGAATCATCACCTTTACAGCCATCTTTAAATATGGGTTTGGCTGT 
GAATCCGTTTAGCCCATCTTTTGCAACCGAGGCTGTCGAGGGAATGAAACACATCAGTCC 
TTCTCAGTCCAACATGGTCCATTGCTCTGCTTCTAATATACTGCAAAAGCCATCAAGACC 
TGCTATTTCAACTCCTCCTGTGGCTAGTAAATCCGCTCAGGCGCGGATTGGAAGGCCTCC 
TGTCGAAGGGCGAGGGAGAGGCCACTTGCTTCCGCGGTATTGGCCAAAATATACGGATAA 
AGAGGTTCAGCAGATCTCTGGAAATTTGAATTTGAACATTGTACCTCTCTTTGAGAAAAC 
TCTTAGTGCCAGTGATGCTGGTCGCATTGGTCGTCTAGTTCTTCCAAAAGCCTGTGCAGA

GGCATATTTTCCTCCGATTAGTCAATCCGAAGGCATTCCTTTGAAAATCCAAGATGTGAG 

GGGTAGGGAGTGGACGTTCCAGTTCAGATATTGGCCCAATAACAATAGTAGAATGTATGT 

TTTAGAAGGTGTCACTCCATGCATACAGTCCATGATGCTACAGGCTGGTGATACAGTAAC 

TTTCAGTCGGGTTGATCCTGGCGGAAAACTAATCATGGGTTCCAGGAAGGCAGCTAATGC 

TGGAGACATGCAGGGTTGTGGGCTCACCAACGGAACATCAACTGAGGACACATCATCGTC 

TGGTGTAACAGAAAACCCACCCTCCATAAATGGTTCCTCGTGTATTTCACTAATACCGAA 

AGAGTTGAATGGTATGCCTGAGAATTTGAACAGTGAGACTAACGGGGGCAGGATAGGTGA 

TGATCCTACACGAGTTAAAGAGAAGAAGAGAACTCGAACCATTGGTGCAAAAAATAAGAG 

ACTTCTTTTGCATAGTGAAGAATCTATGGAGCTGAGACTCACTTGGGAAGAAGCTCAGGA 

CTTGCTTCGTCCCTCTCCTAGTGTAAAGCCTACCATCGTTGTCATTGAGGAGCAAGAAAT 

TGAAGAATATGACGAACCTCCTGTCTTTGGAAAGAGGACTATAGTCACTACAAAACCTTC 

AGGTGAACAGGAACGATGGGCAACTTGCGACGACTGCTCTAAATGGAGAAGGTTACCTGT 

AGATGCTCTTCTTTCCTTTAAATGGACATGTATAGACAATGTTTGGGATGTGAGTAGGTG 

TTCATGTTCTGCACCGGAGGAGAGTCTGAAGGAACTTGAGAATGTTCTTAAAGTAGGTAG 

AGAGCACAAGAAGAGAAGAACTGGGGAAAGACAGGCAGCACAAAGTCAGCAAGAACCGTG 

TGGTTTGGACGCACTGGCGAGTGCAGCAGTCTTAGGAGACACAATAGGCGAGCCAGAGGT 

AGCGACCACGACCAGACATCCAAGGCACAGGGCTGGATGCTCTTGCATCGTGTGCATTCA 

GCCACCAAGTGGGAAAGGTAGGCACAAGCCTACATGTGGCTGCACTGTGTGTAGCACCGT 

GAAGAGAAGGTTCAAGACGCTTATGATGAGGAGGAAGAAGAAGCAGTTGGAGCGCGATGT 

AACAGCAGCAGAAGATAAGAAGAAGAAGGACATGGAACTGGCTGAGTCTGATAAGAGTAA 

GGAGGAGAAGGAAGTGAACACAGCGAGAATAGACCTGAACAGTGATCCATACAATAAAGA 

AGATGTTGAAGCTGTTGCGGTGGAGAAAGAAGAGAGTCGAAAAAGAGCAATAGGACAGTG 

TTCGGGCGTGGTGGCTCAAGACGCCAGTGATGTTTTAGGAGTTACAGAGTTAGAAGGAGA 

GGGTAAGAATGTTCGTGAAGAGCCGAGAGTTTCAAGCTGATATGGAAA 

>G622 Amino Acid Sequence (domain in AA coordinates: TBD) 

MFEVKMGSKMCMNASCGTTSTVEWKKGWPLRSGLLADLCYRCGSAYESSLFCEQFHKDQS

GWRECXI1CSKRLHCGCIASKVTIEI1MDYGGVGCSTCACCHQLNI1NTRGENPGVFS 

TLADRQHVNGESGGRNEGDLFSQPLVMGGDKREEFMPHRGFGKLMSPESTTTGHRLDAAG

EMHESSPLQPSLNMGI^VNPFSPSFATEAVEGMKHISPSQSNMTO 

STPPVASKSAQARIGRPPVEGRGRGHLLPRYWPKYTDKEVQQISGNLNLNIVPLFEKTLS
ASDAGRIGRLVLPKACAEAYFPPISQSEGIPLKIQDVRGREWTFQFRYWPNNNSRMYVLE 
GVTPCIQSMMLQAGDTVTFSRVDPGGKLIMGSRKAANAGDMQGCGLTNGTSTEDTSSSGV 
TENPPSINGSSCISLIPKELNGMPENLNSETNGGRIGDDPTRVKEKKRTRTIGAKNKRLL
LHSEESMELRLTWEEAQDLLRPSPSVKPTIVVIEEQEIEEYDEPPVFGKRTIVTTKPSGE 
QERWATCDDCSKHRRLPVDALLSFKWTCIDNVWDVSRCSCSAPEESLKELENVLKVGREH 
KKRRTGERQAAQSQQEPCGLDALASAAVLGDTIGEPEVATTTRHPRHRAGCSCIVCIQPP 
SGKGRHKPTCGCTVCSTVKRRFKTLMMRRKKKQLERDVTAAE 

KEVNTARIDLNSDPYNKEDVEAVAVEKEESRKRAIGQCSGWAQDASDVLGVTELEGEGK 

NVREEPRVSS* 

>G778 (50.. 1249) 

TCTCAATAACACAAAACCTTTTAAACTAGTAAAATACACAGATTTTAGGATGAGCCAATG 
TGTTCCAAACTGTCACATCGATGATACTCCGGCAGCAGCCACCACCACCGTCCGCTCCAC 
CACAGCCGCAGACATCCCCATATTAGACTACGAGGTAGCCGAGCTGACGTGGGAGAACGG 
GCAACTAGGCTTGCACGGCTTAGGTCCACCGCGAGTGACGGCTTCGTCGACCAAGTACTC 
CACAGGCGCCGGTGGAACGTTGGAGTCGATAGTGGACCAAGCTACTCGCCTCCCTAACCC 
TAAGCCCACGGATGAGCTCGTCCCGTGGTTCCATCATCGCTCCTCCAGGGCCGCGATGGC 
AATGGACGCGCTTCTCCCTTGCTCCAACCTAGTACACGAGCAGCAGAGCAAGCCTGGTGG 
CGTTGGCTCCACCCGGGTGGGGTCATGTAGCGATGGTCGTACCATGGGCGGTGGAAAACG 
AGCAAGAGTGGCACCGGAGTGGAGCGGCGGCGGGAGTCAGCGGCTGACCATGGACACTTA 
CGACGTAGGTTTCACCTCAACATCAATGGGCTCGCACGATAACACAATCGACGATCATGA 
CTCCGTCTGCCACAGCCGCCCACAGATGGAGGACGAAGAAGAGAAGAAAGCCGGAGGAAA 
ATCATCAGTTTCAACCAAGAGAAGCAGAGCTGCTGCTATTCATAACCAATCCGAACGTAA 
GAGGAGAGATAAAATCAATCAAAGGATGAAGACTTTGCAAAAACTGGTTCCCAATTCCAG 
CAAGACGGATAAAGCATCTATGTTGGATGAAGTGATAGAGTATTTGAAGCAACTTCAAGC 
ACAAGTGAGCATGATGAGCAGAATGAATATGCCTTCTATGATGCTTCCTATGGCCATGCA 
GCAACAACAACAACTACAAA0X3TCTCTCATGTCCAATCCCATGGGTTTAGGGATGGGCAT 
GGGGATGCCCGGTCTCGGTCTCCTCGACCTTAATTCTATGAACCGAGCTGCTGCAAGCGC

TCCTAATATCCATGCCAACATGATGCCAAACCCATTTTTGCCCATGAATTG 

GGATGCTTCTTCCAATGACTCTCGATTTCAGTCTCCTCTCATCCCCGATCCTATGTCTGC 

CTTTCTTGCATGCTCTACTCAGCCAACGACGATGGAAGCGTATAGCAGGATGGCTACATT 

ATATCAGCAAATGCAACAACAACTTCCTCCTCCTTCGAATCCAAAATGATTATTACTCAA 

ACACCTCTATATAGTTTACGTCTATATATGTGTTAGTCACATACATACATATATATATTC 

CATCATAATTATTTATTTATATGTATAGGCTTCTCATGAATTATGATATTATACGTATTA 

CGTAAAAAA 

>G778 Amino Acid Sequence (domain in AA coordinates: 220-267) 

MSQCVPNCHIDDTPAAATTTWSTTAADIPILDYEVAELTWENGQLGLHGLGPPRVTASS 

TKYSTGAGGTLESIVDQATRLPNPKPTDELVPWFHHRSSRAAMAMDALVPCSNLVHEQQS 

KPGGVGSTRVGSCSDGRTMGGGKRARVAPEWSGGGSQRLTMDTYDVGFTSTSMGSHDNTI 

DDHDSVCHSRPQMEDEEEKKAGGKSSVSTKRSRAAAIHNQSERK^raKINQRMKTLQKLV 

PNSSKTDKASMLDEVIEYLKQLQAQVSMMS 
GMGMGMPGLGLLDLNSMNRAAASAPNIHANMMPN^ 
PMSAFLACSTQPTTMEAYSRMATLYQQMQQQLPPPSNPK* 
>G791 (173.. 877) 

TTTTCTTTGGGTGTTCCTTCCACCAACGGCAGAAATCGATTCGGCTTAAATCTCCCCCTC 
CTTTCGATCTCTCTGATCGCCGCCGGGAACATTCAATTTCCCGGGAGTTCAACAAAAT^AA 
AAACTCTCCGTTTTTATTTTTCCCCCTTTTTCACCGGTGGAAGTTTCCGGAGATGGTGTC 
ACCCGAAAACGCTAATTGGATTTGTGACTTGATCGATGCTGATTACGGAAGTTTCACAAT 
CCAAGGTCCTGGTTTCTCTTGGCCTGTTCAGCAACCTATTGGTGTTTCTTCTAACTCCAG 
TGCTGGAGTTGATGGCTCGGCTGGAAACTCAGAAGCTAGCAT^AGAACCTGGATCCAAAAA 
GAGGGGGAGATGTGAATCATCCTCTGCCACTAGCTCGAAAGCATGTAGAGAGAAGCAGCG 
ACGGGACAGGTTGAATGACAAGTTTATGGAATTGGGTGCAATTTTGGAGCCTGGAAATCC 
TCCCAAAACAGACAAGGCTGCTATCTTGGTTGATGCTGTCCGCATGGTGACACAGCTACG 
GGGCGAGGCCCAGAAGCTGAAGGACTCCAATTCAAGTCTTCAGGACAAAATCAAAGAGTT 

aaagactgagaaaaacgagctgcgagatgagaaacagaggctgaagacagagaaagaaaa 
gctggagcagcagctgaaagccatgaatgctcctcaaccaagttttttcccagccccacc
tatgatgcctactgcttttgcttcagcgcaaggccaagctcctggaaacaagatggtgcc 

AATCATCAGTTACCCAGGAGTTGCCATGTGGCAGTTCATGCCTCCTGCTTCAGTCGATAC 
TTCTCAGGATCATGTCCTTCGTCCTCCTGTTGCITAATCAAGAAAAATCATCAACCGGTT 
TGCTTCTTGCTTCCGCTTAAAAGAAAAGTCTCCATTTGTTTTGCTCTCCTCTCTTTCTCG 
GCTTTCTTAGTCTTATCCTTTTGCTTTGTCGTGTTATCATCGTAACTGXTATCTGTTGAA 
CAATGATATGACATTGTAAACTCCAATTGCTTCGCGCAATGTTATCTATTCACATGTAAA 
TTTAAGTAGAGTTTGGCAAAAAAAAAA 

>G791 Amino Acid Sequence (domain in AA coordinates: 75-143) 

MVSPENANWICDLIDADYGSFTIQGPGPSWPVQQPIGVSSNSSAGVDGSAGNSEASKEPG 

SKKRGRCESSSATSSKACREKQRRDRLNDKFMELGAILEPGNPPCT 

QLRGEAQKLKDSNSSLQDKIKELKTEKNELRDEKQRLKTEKEKLEQQLKAMNAPQPSFFP 

APPmPTAFASAQGQAPGNKMVPIISYPGVAMWQFMPPASVDTSQDHVLRPPVA* 

>G861 (158.. 880) 

CTTCTTCCTCCTCCTCCATCTCTTCTCTTTACTCTCTCTTTAATCATCTCTCATTCTTGA 

ATCTTGATCCATCAAAATCAATCCCGTTCTCGAAAGATCCATTAAAATCAAAACCTAAGC

TCTCTCTCTTGCTTCTAGGGTTTTTTTGTTCGTTGTGATGGCGAGAGAAAAGATTCAGAT 

CAGGAAGATCGACAACGCAACGGCGAGACAAGTGACGTTTTCGAAACGAAGAAGAGGGCT 

TTTCAAGAAAGCTGAAGAACTCTCCGTTCTCTGCGACGCCGATGTCGCTCTCATCATCTT 

CTCTTCCACCGGAAAACTGTTCGAGTTCTGTAGCTCCAGCATGAAGGAAGTCCTAGAGAG 

GCATAACTTGCAGTCAAAGAACTTGGAGAAGCTTGATCAGCCATCTCTTGAGTTACAGCT 

GGTTGAGAACAGTGATCACGCCCGAATGAGTAAAGAAATTGCGGACAAGAGCCACCGACT 

AAGGCAAATGAGAGGAGAGGAACTTCAAGGACTTGACATTGAAGAGCTTCAGCAGCTAGA 

GAAGGCCCTTGAAACTGGTTTGACGCGTGTGATTGAAACAAAGAGTGACAAGATTATGAG 

TGAGATCAGCGAACTTCAGAAAAAGGGAATGCAATTGATGGATGAGAACAAGCGGTTGAG 

GCAGCAAGGAACGCAACTAACGGAAGAGAACGAGCGACTTGGCATGCAAATATGTAACAA 

TGTGCATGCACACGGTGGTGCTGAATCGGAGAACGCTGCTGTGTACGAGGAAGGACAGTC 

GTCGGAGTCTATTACTAACGCCGGAAACTCTACCGGAGCGCCTGTTGACTCCGAGAGCTC 

CGACACTTCCCTTAGGCTCGGCTTACCGTATGGTGGTTAGAGATGGAACAATTCAAAGAA 
GTTGATGGAGTGAGGAGAGTAATGTAAATCTTTTTAACTCGGTAGTAACAAGAGACAATG

TCTAAGTAGTGAATTCTCAAATGTTTGTGTAAGTTTCTGCCTATGGAAGAGGCTTTCATT 

TTTATGATTTTCACTATGTATGATCTCTCTTCACTGCATTTCTGGTTAGTAACGGCTTGT 

CACCGATAAACTTTCTCGTTATGGAAAGTTAGAATAAAAAAAAAAAAAAAAAAA 

>G861 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MAREKIQIRKIDNATARQVTFSKRRRGLFKKAEELSVLCDABVALIIFSSTGKLFEFCSS 

SMKEVLERHNLQSKNLEKLDQPSLELQLVENSDHAI^SKEIADKSHRLRQI^GEELQG^ 

IEELQQLEKALETGLTRVIETKSDKIMSEISELQKKGMQLMDENKRLRQQGTQLTEENER 

LGMQICNNVHAHGGAESENAAVYEEGQSSESITNAGNSTGAPVDSESSDTSLRLGLPYGG 

* 

>G938 (1..1755) 

ATGATGATGTTTAACGAGATGGGAATGTATGGAAACATGGATTTCTTCTCTTCCTCCACA 
TCTCTCGATGTGTGTCCATTACCACAAGCTGAACAAGAACCTGTAGTTGAAGATGTCGAC 
TACACCGATGATGAGATGGATGTGGATGAGCTTGAGAAGAGGATGTGGAGAGACAAAATG 
CGTTTGAAACGTCTCAAGGAGCAACAGAGTAAGTGTAAAGAAGGCGTCGATGGTTCGAAA 
CAGAGGCAGTCGCAAGAGCAAGCTAGGAGGAAGAAAATGTCTAGAGCCCAAGATGGGATC 
TTGAAGTATATGTTGAAGATGATGGAAGTTTGTAAAGCTCAAGGCTTTGTTTATGGTATT 
ATTCCTGAGAAGGGTAAGCCTGTGACTGGTGCTTCGGATAATTTGAGGGAATGGTGGAAA 
GATAAGGTTAGGTTTGATCGTAATGGTCCAGCTGCTATTGCTAAGTATCAGTCAGAGAAT 
AATATTTCTGGAGGGAGTAATGATTGTAACAGCI^GGTTGGTCC^CACCGCATACGCTT 
CAGGAGCTTCAGGACACGACTCTTGGTTCGCTTTTATCGGCTTTGATGCAACATTGTGAT 
CCACCGCAGAGACGGTTTCCTTTGGAGAAAGGAGTTTCTCCACCTTGGTGGCCTAATGGG 
AATGAAGAGTGGTGGCCTCAGCTTGGTTTACCAAATGAGCAAGGTCCTCCTCCTTATAAG 
AAGCCTCATGATTTGAAGAAAGCTTGGAAAGTCGGTGTTTTAACTGCGGTGATCAAGCAT 
ATGTCGCCGGATATTGCGAAGATCCGTAAGCTTGTGAGGCAATCAAAATGCTTGCAGGAT 
AAGATGACGGCGAAAGAGAGTGCTACTTGGCTTGCCATTATTAACCAAGAAGAGGTTGTG 
GCTCGGGAGCTTTATCCCGAGTCATGCCCTCCTCTTTCTTCTTCTTCATCATTAGGAAGC 
GGGTCGCTTCTCATTAATGATTGTAGCGAGTATGACGTTGAAGGTTTCGAGAAGGAACAA 
CATGGTTTCGATGTGGAAGAGCGGAAACCAGAGATAGTGATGATGCATCCTCTAGCAAGC 
TTTGGGGTTGCTAAAATGCAACATTTTCCCATAAAGGAGGAGGTCGCCACCACGGTAAAC 
TTAGAGTTCACGAGAAAGAGGAAGCAGAACAATGATATGAATGTTATGGTAATGGACAGA 
TCAGCAGGTTACACTTGTGAGAATGGTCAGTGTCCTCACAGCAAAATGAATCTTGGATTT 
CAAGACAGGAGTTCAAGGGACAACCACCAGATGGTTTGTCCATATAGAGACAATCGTTTA 
GCGTATGGAGCATCCAAGTTTCATATGGGTGGAATGAAACTAGTAGTTCCTCAGCAACCA 
GTCCAACCGATCGACCTATCGGGCGTTGGAGTTCCGGAAAACGGGCAGAAGATGATCACC 
GAGCTTATGGCCATGTACGACAGAAATGTCCAAAGCAACCAAACGCCTCCTACTTTGATG 
GAAAACCAAAGCATGGTCATTGATGCAAAAGC^^ 

AGTGGCAATCAAATGTTTATGCAACAAGGGACGAACAACGGGGTTAACAATCGGTTCCAG 
ATGGTGTTTGATTCGACACCATTCGATATGGCAGCATTCGATTACAGAGATGATTGGCAA 
ACCGGAGCAATGGAAGGAATGGGGAAGCAGCAGCAGCAGCAGCAGCAGCAGCAAGATGTA 
TCAATATGGTTCTGA 

>G938 Amino Acid Sequence (domain in AA coordinates: 96-104) 
MMMFNEMGMYGNMDFFSSSTSLDVCPLPQAEQEPW 

RLKRLKEQQSKCKEGVDGSKQRQSQEQARRKKMSRAQDGILKYMLKMMEVCKAQGF 

IPEKGKPVTGASDNLREWWKDKVRFDRNGPAAIAKYQSEl^ISGGSNDCNSLVGPTPHTL 

QELQDTTLGSLLSALMQHCDPPQRRFPLEKGVSPPWWPNGNEEWWPQLGLPNEQGPPPYK 

KPHDLKKAWKVGVLTAVIKHMSPDIAKIRKLVRQSKCLQDKMTAKESATWLAIINQEEVV

ARELYPESCPPLS&SSSLGSGSLIilNDCSEYDVEGFEKEQHGFDVEERKPEIVMMHPLAS 

FGVAKMQHFPIKEEVATTVNLEFTRKRKQNNDMNVMVMDRSAGYTCENGQCPHSKMNLGF

QDRSSRDNHQWCPYRDNRLAYGASKFHMGGMKLVVPQQPVQPIDLSGVGVPENGQKMIT 

ELMAMYDRNVQSNQTPPTLMENQSWIDAKAAQNQ 

MVFDSTPFDMAAFDYRDDWQTGAMEGMGKQQQQQQQQQDVSIWF* 

>G965 (73.. 1956) 

GATTCTCTGTGTATGTCTGAATCCTTACAGGATCCAAGAGCTTTGGAAAAAAGATATAAT 
GAATAACAAGATATGGGTTTAGCTACTAC7VACTTCTTCTATGTCACAAGATTATCATCAT 
CACCAAGGAATCTTTTCCTTCTCTAATGGATTCCACCGATCATCATCAACCACTCATCAG 
GAGGAAGTAGATGAATCCGCCGTCGTCTCCGGTGCTCAAATTCCGGTTTATGAAACCGCC 






GGAATGTTGTCTGAAATGTTTGCTTACCCTGGCGGAGGTGGCGGCGGTTCCGGTGGAGAG 
ATTCTTGATC^GTCTACTAAACAGTTGCTAGAGCAACAAAACCGT^CAAOUVCAACAAT 
AACTCAACTCTTCATATGTTATTACCAAATCATCATCAAGGTTTTGCTTTCACCGACGAA 
AACACTATGCAGCCGCAGCAACAACAACACTTTACATGGCCATCTTCCTCCTCCGATCAT 
CATCAAAACCGAGATATGATCGGAACCGTCCACGTGGAAGGAGGAAAGGGTTTGTCTTTA 
TCTCTCTCATCTTCATTAGCCGCAGCTAAAGCCGAGGAATATAGAAGCATTTATTGTGCA 
GCCGTTGATGGAACTTCTTCTTCTTCTAACGCATCCGCTCATCATCATCAATTCAATCAG 
TTCAAGAATCTTCTTCTTGAGAATTCTTCTTCTCAACATCATCACCATCAAGTTGTTGGA 
CATTTTGGTTCATCATCATCATCTCCCATGGCGGCTTCTTCATCCATTGGAGGGATCTAC 
ACGTTGAGGAATTCGAAATATACGAAACCGGCTCAAGAGTTGTTGGAAGAGTTTTGTAGT 
GTTGGAAGAGGACATTTCAAGAAGAACAAACTTAGTAGGAACAACTCAAACCCTAATACT 
ACCGGTGGAGGAGGAGGCGGAGGGTCCTCGTCATCGGCCGGAACAGCTAATGATAGTCCT 
CCTTTGTCTCCGGCTGATCGGATTGAACATCAAAGAAGAAAAGTCAAGCTACTATCTATG 
CTTGAAGAGGTGGACCGACGGTACAACCACTACTGCGAACAAATGCAAATGGTAGTGAAC 
TCATTCGACCAAGTAATGGGTTACGGCGCGGCGGTTCCGTACACGACATTAGCTCAAAAG 
GCAATGTCTAGGCATTTCCGGTGTTTGAAAGACGCGGTAGCGGTTCAGCTTAAACGCAGC 
TGTGAGCTTCTAGGGGATAAAGAGGCGGCAGGGGCTGCATCCTCGGGGTTAACCAAAGGG 
GAAACGCCGCGATTGCGTTTGCTAGAGCAGAGTTTGCGTCAGCAACGAGCGTTTCATCAT 
ATGGGTATGATGGAGCAAGAGGCATGGAGACCGCAACGTGGTTTGCCTGAACGCTCCGTT 
AATATCCTTAGAGCTTGGCTATTCGAGCATTTTCTTAATCCGTACCCAAGCGATGCTGAT 
AAGCZACCTCTTAGCACGACAGACTGGTTTATCCAGAAATCAGGTGTCAAATTGGTTCATA 
AATGCTAGGGTTCGCCTATGGAAACCAATGGTGGAAGAGATGTATCAACAAGAAGCAAAA 
GAAAGAGAAGAAGCAGAAGAAGAAAATGAAAATCAACAACAACAAAGAAGACAGCAACAA
ACAAACAACAACGACACGAAACCCAACAACAATGAAAACAACTTCACTGTCATAACCGCA 
C^U^CTCCAACGACGATGACATCGACACATCACGAAAACGACTCTTCATTCCTCTCTTCC 
GTCGCCGCCGCTTCTCACGGCGGTTCAGACGCGTTCACCGTCGCCACGTGTCAGCAAGAC 
GTCAGTGACTTCCACGTCGACGGAGATGGTGTGAACGTCATAAGATTCGGGACCAAACAG 
ACTGGTGACGTGTCTCTTACGCTTGGTCTACGCCACTCTGGCAATATTCCTGATAAGAAC 
ACTTCTTTCTCGGTTAGAGACTTTGGAGATTTTTAGTCTTCTTTGTTTCTCT^ATTTATTC 
ATC 

>G965 Amino Acid Sequence (domain in AA coordinates: 423-486) 
MGLATTTSSMSQDYHHHQGIFSFSNGFHRSSSTTHQEEVDESAVVSGAQIPVYETAGMLS 
EMFAYPGGGGGGSGGEILDQSTKQLLEQQNRHMNNl^S 
PQQQQHFTWPSSSSDHHQNRDMIGTVHVEGGKGL^ 

TSSSSNASAHHHQFNQFKNLLLENSSSQHHHHQVVGHFGSSSSSPMAASSSIGGIYTLRN
SKYTKPAQELLEEFCSVGRGHFKKNKLSRNNSNPNTTGGGGGGGSSSSAGTANDSPPLSP 
ADRIEHQRRKVKLLSMLEEVDRRYNHYCEQMQMVVNSFDQVMGYGAAVPYTTLAQKAMSR
HFRCLKDAVAVQLKRSCELLGDKEAAGAASSGLTKGETPRLRLLEQSLRQQRAFHHMGMM 
EQEAWRPQRGLPERSWILRAWLFEHFLNPYPSDADKHLLARQTGLSRNQVSNWFINARV 
RLWKPMVEEMYQQEAKEREEAEEENENQMQRRQQQ 

TMTSTHHENDSSFLSSVAAASHGGSDAFTVATCQQDVSDFHVDGDGVNVIRFGTKQTGDV

SLTLGLRHSGNIPDKNTSFSVRDFGDF* 

>G1143 (54.. 677) 

AAATAAGAATATAAACACTTTTGTCTGAAAAATTATCAAAGAAGAAGAAATAAATGGGTG 
GAGGAAGCAGATTTCAAGAACCAGTGAGGATGAGCCGTAGGAAACAAGTAACAAAAGAGA 
AGGAAGAAGATGAAAACTTCAAATCTCCAAATCTTGAAGCAGAGAGACGTAGAAGAGAGA 
AGCTTCATTGTCGGCTTATGGCTCTGCGATCTCATGTCCCCATTGTCACCAACATGACTA 
AAGCAAGTATTGTTGAAGATGCGATTACTTACATAGGAGAGCTTCAAAACAATGTTAAGA 
ATCTCTTAGAGACATTTCATGAAATGGAAGAAGCTCCTCCTGAGATTGATGAAGAACAAA 
CGGATCCAATGATAAAACCTGAAGTTGAAACTAGTGATCTTAACGAAGAGATGAAGAAAC 
TCGGAATCGAGGAGAATGTGCAATTGTGTAAGATTGGGGAGAGGAAGTTTTGGTTAAAGA 
TCATAACAGAGAAGAGAGATGGGATCTTTACTAAATTCATGGAGGTTATGAGATTTCTCG 
GATTCGAGATTATCGATATTAGTCTAACAACTTCAAATGGAGCAATTCTTATTAGTGCCT 
CTGTTCAGACACAGGAACTCTGTGATGTTGAACAGACAAAAGATTTTCTTTTGGAAGTTA 
TGAGAAGCAATCCATAAGTATTAATTATATACATCTTGGAAATTTCTTGATCTAATAACA 
TTTCCATTGGTTTTTATTACATTGTTGTTCCATTTTAAATATGATATGATTCAGATGAAA 
AAGAGTTTGTGTTACAAGCCAATGA 

>G1143 Amino Acid Sequence (domain in AA coordinates: 33-82)

MGGGSRFQEPWMSRRKQVTKEK^EDENFKSPNLEAERRRREKLHCRLMALRSHVPIVTN 

MTKASIVEDAITYIGELQNNVKK^LETFHEMEEAPPEIDEEQTDPMIKPEVETSDIjNEEM 

KKLGIEENVQLCKIGERKFWLKIITEKRDGIFTKFMEVMRFIiGFEIIDISLTTSNGAILl 

SASVQTQELCDVEQTKDFLLEVMRSNP* 

>G1190 (209.. 2020) 

TCCTGTCCCAAAACCAAAAGACTTGAGAGTGTGTCTTTAGAGAGAGATCTTCTCTCTTTT 

ATCTTACGACTCTCACTTCTTATCTCAAATCTACTTCAACTCTATTTCCAGTCTCCACAT 

TTTCCCACAAATTTCAACTCTTGTTCTCTTCCTCCAAAGTAAAAAACAAATCGTTGC^ 

TGAGGTTTGGTTTTGGTGTTATAGAATTATGAAGAGCGGGAAGCAATCTTCGCAACCTGA 

AAAGGGTACTTCCAGGATCTTGTCACTGACTGTCCTGTTTATCGCATTTTGCGGTTTCTC 

CTTCTACCTCGGTGGTATATTTTGCTCTC 

AAGGACGACTACAAAGGCTGTAGCTTCCCCTAAAGAACCTACAGCTACTCCTATTCAAAT 
CAAATCCGTTTCTTTCCCGGAGTGCGGGTCAGAGTTCCAAGATTACACCCCGTGCACCGA 
TCCAAAGAGGTGGAAGAAGTATGGTGTCCATCGCTTAAGTTTCTTGGAGCGTCATTGTCC 
TCCGGTATATGAAAAGAATGAGTGTTTGATTCCACCACCAGACGGGTATAAACCGCCTAT 
AAGATGGCCCAAGAGCCGAGAACAGTGTTGGTACAGGAACGTGCCTTATGATTGGATCAA 
TAAGCAAAAGTCTAACCAGCATTGGCTTAAGAAAGAAGGAGATAAGTTCCATTTCCCTGG 
TGGTGGTACCATGTTCCCTCGTGGAGTTAGTCACTATGTTGATTTGATGCAAGATCTGAT 
TCCTGAAATGAAAGACGGAACAGTCAGGACCGCCATTGATACTGGCTGTGGGGTTGCGAG 
CTGGGGAGGCGATCTTTTGGACCGTGGGATACTATCACTCTCTCTTGCTCCAAGAGATAA 
CCATGAAGCTCAGGTTCAATTTGCTCTTGAACGTGGAATTCCTGCGATTCTCGGGATCAT 
CTCTACGCAACXFTCTCCCTTTTCC^ 

TCTTATTCCCTGGAC^GAATTTGGTGGAATCTATTTACTTGAGATTCACCGTATAGTTCG 
ACCTGGAGGTTTTTGGGTTCTTTCTGGTCCACCTGTGAACTATAATAGACGATGGCGTGG 
ATGGAACACAACCATGGAAGATCAGAAATCTGACTACAACAAGCTTCAGTCACTTCTAAC 
CTCCATGTGTTTCAAAAAGTACGCTCAAAAAGATGACATAGCCGTGTGGCAGAAACTCTC 
AGACAAATCTTGCTATGACAAAATCGCTAAGAACATGGAAGCTXACCCTCCCAAATGTGA 
CGACAGTATAGAACCTGATTCTGCTTGGTACACTCCACTCCGTCCTTGCGTGGTTGCCCC 
GACACCTAAAGTCAAGAAGTCTGGTCTCGGATCAATCCCAAAATGGCCCGAGAGGTTACA 
TGTCGCGCCCGAGAGAATCGGTGATGTTCACGGAGGGAGTGCGAACAGTTTGAAACACGA 
TGATGGTAAATGGAAGAACAGAGTTAAGCATTACAAGAAAGTTTTACCAGCTCTTGGGAC
AGACAAGATAAGAAATGTTATGGATATGAACACTGTTTATGGAGGTTTCTCTGCGGCCCT 
CATTGAGGATCCCATTTGGGTCATGAACGTTGTATCATCGTACAGCGCAAATTCGCTTCC 
TGTTGTCTTTGATCGCGGTCTCATCGGGACTTACCACGACTGGTGCGAAGCTTTCTCAAC 
GTATCC^VAGAACATATGATCTTCTTCACCTCGACAGTCTTTTTACCTTGGAGAGTCACAG 
GTGTGAGATGAAGTACATTTTGCTAGAGATGGACAGGATCTTGCGGCCGAGTGGATATGT 
TATAATCCGAGAATCGAGTTATTTCATGGACGCAATCACAACGTTAGCGAAAGGGATAAG 
GTGGAGTTGCCGGAGAGAGGAGACTGAGTATGCAGTCAAAAGTGAGAAGATTCTGGTTTG 
CC^GAAAAAGCTATGGTTTTCGTCAAACCAAACCTCTTGATGAGACCACCTGTATCATAG 
TGTTTATCATCTCCTGTGATGCACACTACAGAGAGAAGGATCTAGTCCTTTGAGTCCAAG 
ATATAGCTCTATAAACAATCTCCTTTTTTTGTTCTCTTTAATTTCTTGGGTATTTCACGG 
TATAGATTGATATTATATATTTTTTAATTATATTTTTAATATATAGATATATTAGTATGT 
GGTTTAAACACTATTATTATCAAGGTCTTAAAGATTTGCTTTGCAAGAGTTAAAAAATGT 
TGGAGTAAGGACCTCTTGATTAATAAATTGACTGACGCAGCAAA 

>G1190 Amino Acid Sequence (domain in AA coordinates: entire protein) 

MKSGKQSSQPEKGTSRILSLTVLFIAFCGFSFYLGGIFCSERDKIVAIO)VTRTTTKAVAS 

PKEPTATPIQIKSWFPECGSEFQDYTPCTDPKRWKKYGVHRLSFLERHCPPVYEKNECL 

IPPPDGYKPPIRWPKSREQCWYRNVPYDWINKQKSNQHWLKKEGDKFHFPGGGTMFPRGV 

SHYVDLMQDLIPEMKDGTTOTAIDTGCGVASWGGDLLDRGILSLSLAPRDNHEAQVQFAL 

ERGIPAILGIISTQRLPFPSNAFDMAHCSRCLIPWTEFGGIYLLEIHRIVRPGGFWVLSG 

PPVWYNRRWRGWNTTMEDQKSDYNKLQSLLTSMCFKKYAQKDDIAWQKLSD 

KNMEAYPPKCDDSIEPDSAWYTPLRPCWAPTPKVKKSGLGSIPKWPERLHVAPERIGDV 

HGGSANSLKHDDGKWKNRVKHYKK\n^ 

WSSYSANSLPVVFDRGLIGTYHDWCEAFSTYPRTYDLLHLDSLFTLESHRCEMKYIIiLE 
MDRILRPSGYVIIRESSYFMDAITTLAKGIRWSCRREETEYAVKSEKILVCQKKLWFSSN 

QTS* 

>G1198 (230.. 1675)

TCTTTTCAAATTCCAATCATTTGATCAACTAATCAAGAATTAATTATAAGACTTTGCAAT 

CTCTCTCCCTCTCCCTCTCCCTAGCTAGTTCTCTCTTGTGTTTCTTAACTCGAGCTTCTC 

TCAATAGTGATTATCATCTTTTTCATCATTTCAAGATTTAATGTGTTTTGCAGAAAAGAG 

ACTAATCAAGAAGAGATATCATCAATTGAAGCTGTTTTCTTGAGTAGAGATGGCGAACCA 

TAGAATGAGCGAAGCTACAAACCATAACCACAATCATCATCTTCCTTATTCACTTATTCA 

TGGTCTCAACAACAATCATCCATCTTCTGGTTTCATTAACCAAGATGGATCGTCCAGTTT 

CGATTTTGGAGAGCTAGAAGAAGCAATTGTTCTGCAAGGTGTCAAGTATAGGAACGAGGA 

AGCCAAGCCACCTTTATTAGGAGGAGGAGGAGGAGCTACGACTCTGGAGATGTTCCCTTC 

GTGGCCAATCAGAACTCACCAAACTCTTCCTACTGAGAGTTCCAAGTCAGGAGGAGAGAG 

CAGCGATTCAGGATCGGCTAATTTCTCCGGCAAAGCTGAAAGTCAACAACCGGAGTCTCC 

TATGAGTAGCAAACATCATCTCATGCTTCAACCTCATCATAATAACATGGCAAACTCAAG 

TTCAACATCTGGACTTCCTTCCACTTOTCGAACTTTAGCTCCTCCTAAACCTTCGGAAGA 

TAAGAGGAAGGCTACAACTTCAGGCAAACAGCTTGATGCTAAGACGTTGAGACGTTTGGC 

CCAAAATAGAGAAGCTGCTCGCAAAAGCCGTCTTAGGAAAAAGGCGTATGTGCAACAGCT 

AGAATCAAGTAGGATAT^GCTTTCCCAATTGGAGCAAGAACTTCAGCGAGCTCGTTCTCA 

GGGGCTGTTCATGGGTGGTTGTGGACCACCAGGACCTAACATCACTTCCGGAGCTGCAAT 

ATTTGACATGGAATATGGGAGATGGCTAGAGGATGATAACCGGCATATGTCGGAGATTCG 

AACCGGTCTTCAGGCTCATTTATCTGACAATGATTTAAGGTTGATCGTTGACGGTTACAT 

TGCTCATTTTGATGAGATATTCCGATTAAAAGCCGTGGCAGCGAAAGCCGATGTTTTTCA 

CCTCATC^TTGGGACATGGATGTCCCCAGCCGAACGTTGTTTTATTTGGATGGCTGGT^ 

CCGTCCATCCGACCTAATCAAGATATTGGTGTCGCAAATGGATCTATTGACGGAGCAACA 

ACTGATGGGAATATATAGCOTACAACACTCGTCGCAACAAGCAGAGGAGGCTCTCTCGCA 

AGGCCTCGAACAACTTCAGCAATCTCTCATCGATACTCTCGCCGCATCTCCAGTCATTGA 

CGGAATGCAACAAATGGCTGTCGCTCTCGGAAAGATCTCTAATCTCGAAGGCTTTATCCG 

CCAGGCTGATAACTTGAGGCAGCAGACCGTTCACCAGCTGAGGCGGATCTTGACCGTCCG 

ACAAGCTGCACGGTGTTTCCTAGTCATCGGAGAGTACTATGGACGGCTCAGAGCTCTTAG 

CTCCCTTTGGTTGTCACGCCCACGAGAGACACTGATGAGTGATGAAACCTCTTGTCAAAG 

GACGACGGATTTGCAGATTGTTCAGTCATCTCGGAACCACTTCTCCAATTTCTGAATGGA 

ATGAAACTTTGTATAACTAAAAGGCCAAGTTTC^TTGTCTGTCGTAATTTCACCTATT^ 

CTTTAAAGTTGTACTAGAGAAAAGATAGGATCTTCCTTCG 

>G1198 Amino Acid Sequence (domain in AA coordinates: 173-223) 
MANHRMSEATNHNHNHHLPYSLIHGLNNNHPSSG 

IOTEEAKPPLLGGGGGATTLEMFPSWPIRTHQTLPTESSKSGGESSDSGSANFSGKAESQQ 

PESPMSSKHHLMLQPHHimMANSSSTSGLPSTSRTIiAPPKPSEDKRKATTSGKQ 

RRIiAQNREAARKSRLRKKAYVQQLESSRIKLSQLEQELQRARSQGLFMGGCGPPGPNITS 

GAAIFDMEYGRWLEDDNRHMSEIRTGLQAHLSDNDLRLIVDGYIAHFDEIFRLKAVAAKA

DVFHLIIGTWMSPAERCFIWMAGFRPSDLIKILVSQMDLLTEQQLMGIYSLQHSSQQAEE 

ALSQGLEQLQQSLIDTLAASPVIDGMQQMAVALGKISNLEGFIRQADNLRQQTVHQLRRI 

LTTOQAARCFLVIGEYYGRLRALSSLWLSRPRETLMSDETSCQTTTDLQIVQSSRNHFSN 

F* 

>G1226 (212.. 1159) 

CTGCATTTATTAAGAACAGTTTAGAAAGTGTCAACCCCTAAAGGAATGTTTrrAGTTTAG 
AGGAAAGAGAGAGAAGAAGAAGCAGCAGCAGAAGTTGTTAATTTGAAGACTATTTGAGGA 
AAGACACCTATATCTAAATACTCAAAGTTACAAAAATATTACTTCAGAT^AACAGTTCCAT 
TAGAGAGACTCATAAAGCTTCTCATCTAATTATGAGTGGATTGATGAGTTTTGGTGAATT 
AGAAGACCAATTTGGTCAGATTTCAGACACTACTATGGAAGAGAAGATACCATTTCTGCA 
AATGCTTCAATGC^AGAACACCCTTTTAC^CAACAGAACCAAATCAGTTTCTCCAATC 
ACTTCTCCAGATCCAAACCCTAGAATCAAAGAGCTGTCTCACCCTTGAAACAAACATCAA 
AAGAGATCCGGGTCAAACAGATGACCCGGAAAAGGATCCAAGAACAG7VAAACGGAGCAGT 
AACGGTGAAAGAAAAAAGAAAACGGAAACGTACAAGAGCTCCAAAGAACAAAGACGAAGT 
TGAAAACCAAAGGATGACTCACATTGCCGTCGAACGTAATCGAAGACGACAAATGAACGA 
ACACTTAAACTCTCTCCGATCTCTCATGCCTCCTTCGTTTCTTCAACGGGGTGACCAAGC 
TTCGATTGTAGGAGGGGCAATAGATTTCATCAAGGAACTAGAGCAACTCTTGCAATCTCT 
AGAAGCTGAGAAACGAAAGGATGGAACTGATGAAACTCCTAAAACGGCGTCGTGTTCTTC 
ATCTTCGTCTCTTGCATGCACTAACTCTTCTATTTCTAGCGTGTCTACGACGTCGGAAAA 
TGGATTTACGGCGAGATTCGGCGGTGGAGATACGACAGAAGTGGAGGCTACGGTGATACA 
GAACCATGTGAGCTTAAAAGTTCGGTGTAAGAGAGGAAAACGACAGATCTTAAAAGCTAT
TGTCTCGATTGAAGAACTAAAGCTTGCGATTCTACATCTCACTATCTCTTCTTCCTTTGA 
CTTTGTCATCTACTCTTTCAATCTCAAGATGGAAGATGGTTGTAAATTAGGATCAGCAGA 
TGAGATAGCGACAGCCGTTCATCAGATCTTCGAGCAAATCAACGGTGAAGTCATGTGGTC
AAATCTTAGTCGAACTTAGTTGACTTTTGACTCCTAGTAACGTGTGTAAACTTTAGGTTA 
CAAAGAAAAGGGACGTGATATAAATAAGAAAAACCAAAGAGGTGAAATTTTGGGAGTTTT 
AATTATTATCTTATACTTTTTGGATTTTAGATTAGTAGCAAACTCGCAGTGTTCTACGAT 
GACATTATTATTGGTCACATGAAGGTTTAGGTTAAAAAAAAA 

>G1226 Amino Acid Sequence (domain in AA coordinates: 115-174)
MSGLMSFGELEDQFGQISDTTMEEKIPFLQMLQCIEHPFTTTEPNQFLQSLLQIQTLESK 
SCLTLETNIKRDPGQTDDPEKDPRTENGAVTVKEK^ 

ERNRRRQMNEHLNSLRSLMPPSFLQRGDQASIVGGAJDFIKELEQLLQSLEAEKRKDGTD 

ETPK^ASCSSSSSLACTNSSISSVSTTSENGFTARFGGGDTTEVEAWIQNHVSLKVRCK 

RGKRQILKAIVSIEELKliAILHLTISSSFDFVIYSFNLKMEDGCKLGSADEIATAVHQIF 

EQINGEVMWSNLSRT* 

>G1451 (124.. 2559) 

TTTGTACTTCCGGAGCTAAAGAGTTATAGCTACTGTAGTAGCTGGAAGTGAAGAAGATTT 
TTTAATAGATTGTACGGAAAAATTAGGGTTTTCAAAGTTTGGTTTCTTGAAGTTGAATTA 
GACATGAAGCTGTCAACATCTGGATTGGGTCAACAGGGTCATGAAGGAGAGAAGTGTCTG 
AATTCTGAGCTATGGCATGCTTGTGCTGGACCATTAGTCTCTCTTCCATCATCTGGTAGT 
CGAGTTGTTTACTTTCCACAGGGTCACAGTGAACAGGTAGCTGCTACAACTAATAAGGAA 
GTTGATGGTCACATACCCAATTACCCAAGCCTACCACCACAATTGATATGCCAGCTCCAT 
AATGTTACAATGCATGCAGATGTTGAGACGGATGAAGTCTATGCTCAAATGACACTTCAA 
CCATTGACACCGGAGGAGCAGAAGGAAACATTTGTACCGATTGAGTTGGGGATACCGAGT 
AAGCAACCTAGTAATTATTTTTGTAAGACTCTCACAGCTAGTGATACCAGTACACATGGA 
GGGTTTTCTGTTCCTAGACGTGCTGCTGAGAAAGTGTTTCCTCCATTGGATTACACACTG 
CAGCCACCAGCTCAAGAACTGATTGCAAGGGATCTCCATGATGTTGAATGGAAGTTTAGG 
CATATCTTTCGGGGACAGCCCAAACGGCATCTCCTAACTACTGGATGGAGTGTCTTTGTC 
AGTGCCAAGCGACTAGTAGCTGGAGATTCTGTCATTTTCATCAGGAATGAAAAGAATCAA 
CTCTTTTTGGGAATTCGTCATGCCACTCGGCCGCAGACTATTGTACCATCATCTGTTTTA 
TCTAGTGATAGCATGCATATTGGACTCCTTGCTGCTGCTGCACATGCTTCTGCAACTAAT 
AGCTGTTTCACTGTTTTCTTTCATCCAAGGGCTAGCCAATCTGAGTTTGTGATACAACTT 
TCCAAGTACATTAAAGCCGTTTTTC^CACGCGTATTTCAGTTGGGATGCGCTTTCGCATG 
CTCTTCGAGACAGAAGAGTCGAGTGTCCGCAGGTACATGGGTACTATAACTGGTATTAGT 
GATCTAGATTCTGTTCGTTGGCCAAACTCTCATTGGCGATCTGTGAAGGTTGGTTGGGAT 
GAATCGACTGCAGGGGAGAGACAGCCAAGGGTTTCTTTATGGGAGATTGAGCCTCTGACT 
ACCTTTCCTATGTATCCATCTCTTTTTCCTCTCAGACTAAAACGTCCATGGCATGCTGGC 
ACATCATCTTTGCCTGATGGAAGGGGTGATTTGGGAAGTGGTCTAACATGGCTAAGAGGG 
GGAGGTGGAGAGCAGC^GGTTTGCTTCCTCTAAATTATCCATCTGTTGGTTTGTTTCCA 
TGGATGCAACAAAGGCTGGATCTCAGTCAAATGGGGACTGATAATAATCAGCAATACCAA 
GCAATGTTAGCTGCTGGGTTGCAGAACATCGGCGGTGGAGATCCTTTAAGACAGCAGTTT 
GTACAGCTGCAAGAGCCTCACCACCAATATCTTCAACAATCAGCTTCCCATAATTCTGAT 
TTGATGCTTCAGCAGCAACAGCAGCAACAAGCGTCACGCCATCTCATGCATGCTCAAACA 
C^GATTATGAGTGAGAATCITCCGC^GCAGAATATGCGACAAGAAGTTAGTAACCAACCA 
GCTGGACAGCAGCAACAGCTACAGCAACCGGACCAAAATGCAT^ 

ATGCAAAATGGCCATCTTCAACAGTGG(^GCAGCAATCAGAGATGCCATCTCCCTCGTTC 
ATGAAGTCAGATTTTACTGACTCAAGCAACAAATTTGCAACAACTGCTAGTCCGGCTTCT 
GGAGATGGCAATCT^TTGAATTTTTCTATAACCGGTCAGTCTGTACTCCCTGAGCAGTTA 
ACAAC^GAGGGCTGGTCTCCAAAAGCATCCAACACTTTTTCTGAACCGTTGTCACTTCCA 
CAAGCCTATCCTGGGAAGAGTCTTGCTCTAGAACCCGGAAATCCGCAGAATCCCTCTCTT 
TTCGGTGTTGATCCCGACTCTGGACTCTTCCTCCCCAGTACGGTTCCCCGCTTTGCTTCT 
TCATCAGGAGATGCTGAAGCTTCCCCTATGTCACTAACAGATTCAGGATTTCAGAATTCC 
TTATATAGCTGCATGCAAGACACAACTCATGAGTTATTGCATGGAGCTGGACAGATTAAC 
TCGTCCAACCAAACCAAGAACTTTGTAAAGGTTTATAAATCTGGTTCGGTTGGGCGTTCA 
TTAGACATCTCCCGATTCAGCAGCTACCACGAGCTGCGAGAAGAGTTAGGGAAGATGTTT 
GCTATCGAAGGGTTGTTGG71AGACCCCCTTAGATCAGGCTGGCAGCTTGTATTCGTTGAC 
AAGGAAAATGATATTCTTCTCCTTGGTGATGACCCATGGGAGTCATTTGTGAATAACGTT 
TGGTACATAAAGATACTATCACCAGAAGATGTGCATCAAATGGGAGATCATGGAGAAGGC

AGTGGTGGGTTATTCCCGCAAAACCCGACCCATCTCTAGAAGCTGCTTCGGTGTTAGTCT 

CATCATGCTACAACGCGGGAGCCCTTTGTTTCCCATTTGAAGTCGTTTCCACTCATCTTT 

ATATGCCATTCGTTCGCATCTCTCTCGTTTTGACGTTTTTAGAAAGAAACATAATCATAT 

TTGTGAGTATGGGTCCTGAAACTTTAGGACGTACTTTAGCTTGTATTAGACAGACACTCT 

CGTCATAAACATAAGAACCTTTATGTAGCTGTCTCAGGGTAACTAAACTTTTCTAG 

>G1451 Amino Acid Sequence (domain in AA coordinates: 22-357) 

MKLSTSGLGQQGHEGEKCLNSELWHACAGPLVSLPSSGSRVVYFPQGHSEQVAATTNKEV 

DGHIPNYPSLPPQLICQLH3^TMHADVETDEVYAQMTLQPLTPEEQKETFVPIELGIPSK 

QPSNYFCKTLTASDTSTHGGFSVPRRAAEKVFPPIJ)yTLQPPAQELIARDLHDVEWKTRH 

IFRGQPKRHLLTTGWSVFVSAKRLVAGDSVIFIRNEKNQLFLGIRHATRPQTIVPSSVLS 

SDSMHIGLIJUy^ASATNSCFTWFHPRASQSEFVIQLSKYIKAVFHTRISV 

FETEESSVRRYMGTITGISDLDSVRWPNSHWRSVKVGWDESTAGERQPRVSLWEIEPLTT 

FPMYPSLFPLRLKRPWHAGTSSLPDGRGDLGSGLTWLRGGGGEQQGLLPLNYPSVGLFPW 

MQQRLDLSQMGTDNNQQYQAMLAAGLQNIGGGDPLRQQFVQLQEPHHQYLQQSASHNSDL

MLQQQQQQQASRHLMHAQTQIMSENLPQQNMRQEVSNQPAGQQQQLQQPDQNAYLNAFKM 

QNGHLQQWQQQSEMPSPSFMKSDFTDSSNKFATTASPASGDGNLLNFSITGQSVLPEQLT

TEGWSPKASNTFSEPLSLPQAYPGKSLALEPGNPQNPSLFGVDPDSGLFLPSTVPRFASS 

SGDAEASPMSLTDSGFQNSLYSCMQDTTHELLHGAGQINSSNQTKNFVKVYKSGSVGRSL 

DISRFSSYHELREELGKMFAIEGLLEDPLRSGWQLVFVDKENDILLLGDDPWESFVNNVW 

YIKILSPEDVHQMGDHGEGSGGLFPQNPTHL*

>G1478 (1..354) 

ATGTGTAGAGGGTTTGAGAAAGAAGAAGAGAGAAGAAGCGACAATGGAGGATGCCAAAGA 

CTATGCACGGAGAGTCACAAAGCTCCGGTAAGCTGTGAGCTTTGCGGCGAGAACGCCACC 

GTGTATTGTGAGGCAGACGCAGCTTTCCTTTGTAGGAA^TGCGATCGATGGGTCCATTCT 

GCTAATTTTCTAGCTCGGAGACATCTCCGGCGCGTGATCTGCACGACCTGTCGGAAGCTA 

ACTCGTCGATGTCTTGTCGGTGATAATTTTAATGTTGTTTTACCGGAGATAAGGATGATA 

GCAAGGATTGAAGAACATAGTAGTGATCACAAAATTCCCTTTGTGTTTCTCTGA 

>G1478 Amino Acid Sequence (domain in aa coordinates: 32-76) 

MCRGFEKEEERRSDNGGCQRLCTESHKAPVSCELCGENATVYCEADAAFLCRKCDRWVHS 

ANFLARRHLRRVICTTCRKLTRRCLVGDNFNVVLPEIRMIARIEEHSSDHKIPFVFL*

>G1496 (116..1123)

AAACCCACCAAATAACTCAGAGCTTTTTTGCATTTTTTCCCATTCTCTATTTTGTTTTGT 
ACTTTTGGTCTCACTTTAAAAGATCATAAGTTGAAAGATTTCTGCAGAGAACAATATGTT 
GGAAGGTCTTGTCTCTCAAGAAAGCTTGTCCTTAAACTCTATGGACATGTCTGTACTTGA 
AAGGCTTAAATGGGTACAACAGCAACAACAGCAACTGCAACAAGTTGTGTCCCATAGCAG 
TAATAATTCACCTGAACTTCTTCAGATACTTCAGTTCCATGGAAGCAACAATGATGAGTT 
GTTGGAGAGTAGTTTCAGCCAATTTCAAATGCCT 

CATGGGTTTTGGTCCTCCACATGAATCCATTTCAAGAACAAGTAGCTGCCATATGGAACC 

TGTGGATACAATGGAGGTTTTGTTGAAGACCGGTGAAGAAACCAGAGCCGTTGCCTTGAA 

GAACAAGAGAAAACCAGAGGTTAAGACAAGGGAAGAGCAAAAGACAGAGAAGAAGATCAA 

AGTAGAGGCTGAGACAGAGTCAAGCATGAAAGGAAAATCAAACATGGGAAACACTGAAGC 

ATCTTCAGACACTTCAAAGGAGACATCGAAAGGAGCTTCAGAGAATCAGAAATTAGATTA 

TATCCACGTGAGAGCTCGTCGAGGCCAAGCCACTGACAGACACAGCTTAGCAGAAAGGGC 

GAGAAGAGAAAAGATCAGCAAGAAAATGAAATATCTGCAAGATATTGTGCCTGGATGCAA 

TAAGGTCACAGGAAT^AGCTGGTATGCTTGATGAGATCATCAATTATGTTCAATGTCTCCA 

AAGACAAGTCGAGTTCCTGTCGATGAAACTTGCTGTCTTGAACCCGGAACTAGAGCTTGC 

CGTGGAAGATGTATCCGTAAAACAGGCTTACTTTACAAATGTAGTTGCTTCAAAGCAATC

AATAATGGTTGATGTGCCATTGTTTCCGTTAGACCAGCAAGGATCTCTAGATTTGTCTGC 

GATAAACCCGAACCAAACGACATCTATCGAAGCTCCATCTGGAAGCTGGGAAACTCAATC 

ACAGAGTCTCTACAACACATCTAGCCTCGGTTTTCATTACTAAGCAAGATTCATTGAAAC 

AACATGGTTGACATCAATCAATCATCAAAATCAGAAGCAAATTCTATTACATTTGCTCAT 

CAAAGTAGTAATTTCGAAATTTGGTTAATGCATTATCCTTTGATCCTTGTTTTCTGATAT 

TTAAACCAGAAGAACTGGAGATAGCAATCCAATGATCTTGTCACCA 

>G1496 Amino Acid Sequence (domain in AA coordinates: 184-248) 

MLEGLVSQESLSLNSMDMSVLERLKWVQQQQQQLQQVVSHSSNNSPELLQILQFHGSNND

ELLESSFSQFQMLGSGFGPNYNMGFGPPHESISRTSSCHMEPVDTMEVLLKTGEETRAVA 
LKNKRKPEVKTREEQKTEKKIKVEAETESSMKGKSNMGNTEASSDTSKETSKGASENQKL
DYIHVRARRGQATDRHSLAERARREKISK 

LQRQVEFLSMKLAVLNPELELAVEDVSVKQAYFTNVVASKQSIMVDVPLPPLDQQGSLDL

SAINPNQTTSIEAPSGSWETQSQSLYNTSSLGFHY* 

>G1526 (1..3090) 

ATGGGAACGAAAGTCTCAGACGATCTTGTTTCCACCGTCAGATCAGTCGTGGGTTCCGAT 

TACTCAGATATGGATATAATCAGGGCTTTACACATGGCGAATCATGATCCAACGGCTGCT 

ATCAATATAATCTTCGACACTCCAAGTTTCGCCAAACCTGATGTAGCCACTCCTACCCCG 

AGCGGCTCTAATGGAGGGAAGCGAGTTGATAGTGGATTAAAGGGCTGTACTTTTGGTGAC 

AGCGGAAGTGTTGGAGCGAATCATCGCGTGGAGGAAGTUUU^TGAGAGTGTTAATGGTGGA 

GGAGAAGAGAGTGTTTCAGGGAATGAGTGGTGGTTTGTTGGTTGTTCTGAATTGGCTGGG 

TTATCGACATGTAAAGGAAGGAAATTGAAGTCTGGTGATGAATTGGTGTT(^CGTTTCCG 

CATAGTAAAGGATTAAAGCCTGAGACTACGCCTGGGAAGCGCGGTTTTGGGCGGGGAAGG 

CC^GCTTTGCGTGGTGCTTCTGATATCGTTAGGTTCTCTACAAAGGATTCAGGAGAGATT 

GGTAGAATACGAAACGAGTGGGCTCGGTGTCTTCTACCACTTGTGAGAGACAAGAAAATT 

AGGATAGAAGGCAGTTGCAAGTCGGCGCCTGAAGCTTTGAGCATCATGGATACAATTCTT 

CTGTCTGTAAGCGTGTACATTAATAGTTCCATGTTTCAAAAGCATAGTGCGACTTCATTT 

AAGACAGCTAGTAATACGGCAGAGGAATCAATGTTCC^TCCTCTCCCA7y\TCTCTTTCGG 

TTACTCGGTTTGATCCCCTTTAAGAAGGCAGAGTTTACTCCAGAGGATTTTTACTCTAAG 

AAGCGACCTTTGAGTTCCAAGGATGGTTCTGCTATTCCTACTTCGTTGCTTCAATTAAAC 

AAGGTCAAGAATATGAATCAAGATGCAAACGGAGATGAAAATGAGCAGTGTATCAGCGAT 

GGTGATCTTGATAACATTGTTGGTGTTGGGGACAGTTCTGGATTAAAGGAAATGGAAACT 

CCACATACACTTCTGTGTGAGCTTCGTCCATACCAAAAGCAGGCACTTCATTGGATGACC 

CAACTGGAGAAAGGAAATTGCACTGATGAGGCAGCAACAATGCTTCACCCGTGTTGGGAA 

GCATACTGTTTAGCAGACAAGAGGGAACTGGTTGTCTACCTGAATTCTTTTACTGGTGAT 

GCTACAATACACTTCCCTAGCACACTTCAAATGGCAAGAGGAGGAATATTAGCAGACGCA 

ATGGGTCTTGGAAAGACTGTAATGACCATATCCCTTTTGCTTGCCCATTCTTGGAAAGCT 

GCATCAACTGGGTTTCTATGCCCCAACTATGAAGGAGACAAAGTGATCAGCAGTTCTGTA 

GATGATCTCACTAGTCCCCCGGTGAAGGCAACCAAATTTCTAGGCTTTGATAAGAGGCTT 

CTTGAACAAAAAAGTGTACTTCAAAATGGTGGTAACCTGATTGTATGTCCGATGACACTT 

TTAGGACAGTGGAAGACAGAGATTGAAATGCATGCAAAGCCTGGGTCTCTATCTGTCTAT 

GTTCACTATGGGCAAAGCAGGCCGAAGGATGCAAAACTTCTTTCCCAGAGTGATGTGGTA 

ATCACCACATATGGAGTTCTAACATCCGAATTCTCGCAAGAGAACTCAGCAGACCATGAA 

GGAATTTATGCAGTTCGATGGTTTAGGATTGTTCTTGACGAGGCACATACCATCAAAAAC 

TCAAAAAGCCAAATTTCCTTGGCTGCTGCAGCTCTGGTTGCTGATAGGCGTTGGTGTCTT 

ACGGGTACTCCTATTCAGAACAATCTGGAGGATTTATACAGCCTTCTACGGTTTTTGAGG 

ATTGAACCATGGGGAACTTGGGCATGGTGGAATAAACTTGTCCAAAAGCCATTTGAAGAG 

GGTGATGAGAGAGGGTTAAAGCTAGTGCAGTCTATCTTAAAACCTATCATGCTTAGGAGA 

ACAAAGTCTAGCACAGACCGAGAAGGAAGGCCGATTCTTGTTCTACCCCCTGCTGATGCA 

CGGGTCATTTACTGTGAACTTTCGGAGTCTGAGAGGGATTTCTACGACGCGCTATTTAAA 

AGATCCAAGGTCAAATTTGATCAATTTGTTGAACAAGGCAAAGTTCTTCATAACTATGCT 

TCGATCCTGGAACTGCTTTTGCGTCTTCGACAATGTTGTGATCACCCATTTTTAGTAATG 

AGTCGAGGGGATACAGCGGAATACTCTGATCTGAATAAGCTTTCTAAACGTTTCCTTAGT 

GGAAAGTCTTCTGGCTTAGAAAGGGAAGGAAAAGATGTACCGTCAGAGGCTTTTGTTCAG 

GAGGTGGTAGAGGAACTGCGCAAAGGAGAGCAAGGAGAGTGTCCAATATGCCTTGAAGCA 

CTTGAGGATGCTGTATTAACGCCATGTGCTCATAGATTATGTCGTGAGTGTCTCTTGGCA 

AGTTGGAGAAATTCTACTTCTGGGTTATGTCCTGTGTGTAGGAACACTGTAAGCAAACAA 

GAACTCATCACAGCACCAACCGAAAGTAGATTCCAGGTTGACGTGGAAAAG^TTGGGTG 

GAATCATCGAAAATCACTGCTCTTCTGGAAGAGCTTGAAGGTCTTCGTTCTTCAGGCTCT 

AAGAGC^TTCTCTTTAGCCAGTGGACCGCTTTCCTCGATCTCCTCCAAATTCCCCTCTCT 

CGGAATAACTTTTCATTTGTCCGTCTTGATGGCACGCTAAGTCAGCAGCAACGAGAGAAG 

GTCCTTAAAGAATTTTCCGAAGATGGCAGTATCCTGGTACTGTTGATGTCTCTAAAAGCT 

GGTGGCGTTGGGATAAATCTAACAGCTGCGTCCAATGCTTTTGTCATGGATCCATGGTGG 

AACCCAGCGGTAGAGGAACAAGCTGTTATGCGTATTCATCGTATAGGGCAAACTAAGGAA 

GTCAAAATCAGAAGATTCATCGTTAAGGGAACGGTTGAAGAGAGAATGGAGGCGGTTCAG 

GCGAGGAAGCAGAGAATGATCTCTGGGGCTTTAACCGATCAAGAAGTACGAAGTGCACGT 

ATAGAGGAACTCAAGATGTTATTTACCTGA 
>G1526 Amino Acid Sequence (domain in AA coordinates: 493-620, 864-1006)

MGTKVSDDLVSTVRSVVGSDYSDMDIIRALHMANHDPTAAINIIFDTPSFAKPDVATPTP 

SGSNGGKRVDSGLKGCTFGDSGSVGANHRVEEENESVNGGGEESVSGNEWWFVGCSELAG 

LSTCKGRKLKSGDELVFTFPHSKGLKPETTPGKRGFGRGRPALRGASDIVRFSTKDSGEI 

GRIPNEWARCLLPLVRDKKIRIEGSCKSAPEALSIMDTILLSVSVYINSSMFQKHSATSF 

KTASNTAEESMFHPLPNLFRLLGLIPFKKAEFTPEDFYSKKRPLSSKDGSAIPTSLLQLN 

KVKNMNQDANGDENEQCISDGDLDNIVGVGDSSGLKEMETPHTLLCELRPYQKQALHWMT 

QLEKGNCTDEAATMLHPCWEAYCLADKRELVVYLNSFTGDATIHFPSTLQMARGGILADA 

MGLGKTVMTISLLLAHSWKAASTGFLCPNYEGDKVISSSVDDLTSPPVKATKFLGFDKRL

LEQKSVLQNGGNLIVCPMTLLGQWKTEIEMHAKPGSLSVYVHYGQSRPKDAKLLSQSDVV

ITTYGVLTSEFSQENSADHEGIYAVRWFRIVLDEAHTIKNSKSQISLAAAALVADRRWCL

TGTPIQNNLEDLYSLLRFLRIEPWGTWAWWNKLVQKPFEEGDERGLKLVQSILKPIMLRR

TKSSTDREGRPILVLPPADARVIYCELSESERDFYDALFKRSKVKFDQFVEQGKVLHNYA 

SILELLLRLRQCCDHPFLVMSRGDTAEYSDLNKLSKRFLSGKSSGLEREGKDVPSEAFVQ

EVVEELRKGEQGECPICLEALEDAVLTPCAHRLCRECLLASWRNSTSGLCPVCRNTVSKQ

ELITAPTESRFQVDVEKNWVESSKITALLEELEGLRSSGSKSILFSQWTAFLDLLQIPLS

RNNFSFVRLDGTLSQQQREKVLKEFSEDGSILVLLMSLKAGGVGINLTAASNAFVMDPWW

NPAVEEQAVMRIHRIGQTKEVKIRRFIVKGTVEERMEAVQARKQRMISGALTDQEVRSAR

IEELKMLFT*

>G1543 (1..828) 

ATGATAAAACTACTATTTACGTACATATGCACATACACATATAAACTATATGCTCTATAT 

CATATGGATTACGCATGCGTGTGTATGTATAAATATAAAGGCATCGTCACGCTTCAAGTT 

TGTCTCTTTTATATTAAACTGAGAGTTTTCCTCTCAAACTTTACCTTTTCTTCTTCGATC 

CTAGCTCTTAAGAACCCTAATAATTCATTGATCAAAATAATGGCGATTTTGCCGGAAAAC 

TCTTCAAACTTGGATCTTACTATCTCCGTTCCAGGCTTCTCTTCATCCCCTCTCTCCGAT 

GAAGGAAGTGGCGGAGGAAGAGACCAGCTAAGGCTAGACATGAATCGGTTACCGTCGTCT 

GAAGACGGAGACGATGAAGAATTCAGTCACGATGATGGCTCTGCTCCTCCGCGAAAGAAA 

CTCCGTCTAACCAGAGAACAGTCACGTCTTCTTGAAGATAGTTTCAGACAGAATCATACC 

CTTAATCCCAAACAAAAGGAAGTACTTGCCAAGCATTTGATGCTACGGCCAAGACAAATT 

GAAGTTTGGTTTCAAAACCGTAGAGCAAGGAGCAAATTGAAGCAAACCGAGATGGAATGC

GAGTATCTCAAAAGGTGGTTTGGTTCATTAACGGAAGAAAACCACAGGCTCCATAGAGAA 

GTAGAAGAGCTTAGAGCCATAAAGGTTGGCCCAACAACGGTGAACTCTGCCTCGAGCCTT 

ACTATGTGTCCTCGCTGCGAGCGAGTTACCCCTGCCGCGAGCCCTTCGAGGGCGGTGGTG 

CCGGTTCCGGCTAAGAAAACGTTTCCGCCGCAAGAGCGTGATCGTTGA 

>G1543 Amino Acid Sequence (domain in AA coordinates: 135-195) 

MIKLLFTYICTYTYKLYALYHMDYACVCMYKYKGIVTLQVCLFYIKLRVFLSNFTFSSSI

LALKNPNNSLIKIMAILPENSSNLDLTISVPGFSSSPLSDEGSGGGRDQLRLDMNRLPSS

EDGDDEEFSHDDGSAPPRKKLRLTREQSRLLEDSFRQNHTLNPKQKEVLAKHLMLRPRQI
EVWFQNRRARSKLKQTEMECEYLKRWFGSLTEENHRLHREVEELRAIKVGPTTVNSASSL
TMCPRCERVTPAASPSRAVVPVPAKKTFPPQERDR*

>G162 (101..619)

AGACATACAACACCAAAATCTTCTTCTTCACCAACATATTCACTTTCACAGCAAAAAAAA 

ACGAGAGGTTCTCTCTTATTCGTACCGTTTTAGCAAACAAATGGGTCGGAGAAAGATCAA 

GATGGAGATGGTTCAGGACATGAACACACGACAGGTTACCTTTTCAAAACGGAGGACTGG

TTTGTTCAAGAAGGCGAGCGAGTTAGCCACGCTCTGCAACGCTGAGTTGGGCATCGTTGT 

CTTTTCACCAGGAGGCAAGCCTTTCTCCTACGGGAAACCGAATCTTGATTCTGTTGCAGA 

GCGATTCATGAGAGAATATGATGATTCAGACAGTGGCGATGAAGAAAAAAGTGGTAATTA 

CAGGCCTAAACTGAAGAGGCTGAGTGAACGTCTCGATTTGCTCAACCAAGAGGTTGAAGC 

TGAGAAGGAACGAGGCGAGAAGAGTCAGGAGAAGCTTGAATCTGCTGGGGATGAGAGATT 

CAAGGAGTCCATTGAGACGCTTACCCTCGATGAACTCAATGAATACAAAGATAGGCTTCA 

GACAGTCCATGGTAGGATTGAAGGTCAAGTCAATCACTTGCAGGCTTCGTCTTGCCTCAT 

GCTTCTCTCCAGAAAATAGCTAGACCGACTTGTTAGAGTTACATTCTATTTTTTGTATCA 

GCCTACAGAACTTACCAACAC^TGAAAGTTATTGCTGGTGTAGAATTTTCTGTCATCTAT 

GGGGTGTGACTTTCTATTTGACATCAAATGAAAATGTACCTGGAAATTTGTCTGTATTAA 

TCTCAAGTGTACTTGCTAAACTTGATCAGCTTTTTCGCAAT^AAAAA 

>G162 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKIKMEMVQDMNTRQVTFSKRRTGLFKKASELATLCNAELGIVVFSPGGKPFSYGKP
NLDSVAERFMREYDDSDSGDEEKSGNYRPKLKRLSERLDLLNQEVEAEKERGEKSQEKLE
SAGDERFKESIETLTLDELNEYKDRLQTVHGRIEGQVNHLQASSCLMLLSRK*
>G1640 (168..1196)

TTCGCCAGATCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGTTTCGCTGACA 
AGCTGCTCTAGCTTATCTGGTACCGTCGACCTCTCACTCAAGGGTCCAAAAGTGTTTTCT 

CGTGTTGTGAGAAAATCGGGTTGAAGAGAGGGAGATGGACAGCCGAGGAAGATGAGATCC 
TCACCAAGTATATTCAGACCAATGGTGAAGGTTCTTGGCGATCTTTGCCTAAGAAAGCTG 
GATTGTTGAGATGTGGAAAGAGCTGTAGACTAAGGTGGATAAACTACTTAAGAAGAGACT 
TAAAAAGAGGAAATATTACTTCCGACGAAGAAGAAATAATCGTCAAGTTGCATTCCCTTC 
TCGGCAACAGATGGTCACTTATTGCAACACATCTACCAGGAAGAACAGACAACGAAATTA 
AAAACTATTGGAACTCACATCT(^GCCGCAAAATCTATGCCTTCACTGCCGTTTCCGGAG 
ATGGACACAATCTACTCGTCAACGATGTAGTCTTGAAGAAATCTTGTTCATCGTCTTCTG 
GAGCCAAGAACAATAACAAGACCAAGAAGAAGAAGAAGGGAAGGACTAGTAGGTCATCCA 
TGAAGAAACACAAGCAAATGGTGACGGCCTCACAATGTTTCTCACAACCTAAGGAGCTAG 
AGAGTGATTTCAGTGAGGGAGGGCAAAATGGTAATTTTGAAGGAGAGTCTTTGGGGCCTT 
ATGAGTGGTTGGATGGTGAGCTAGAACGGCTCTTGAGTAGTTGTGTCTGGGAATGCACTA 
GTGAAGAGGCTGTGATTGGAGTAAATGATGAAAAGGTGTGTGAGAGTGGGGACAATAGTA 
GTTGTTGTGTTAATTTGTTTGAAGAAGAACAAGGAAGCGAGACAAAGATTGGTCACGTAG 
GAATCACAGAGGTTGATCATGATATGACGGTGGAAAGAGAAAGAGAGGGAAGTTTTTTAA 
GTTCGAATTCAAATGAAAATAATGATAAAGATTGGTGGGTTGGTCTATGTAATTCTTCAG 
AAGTTGGGTTTGGGGTTGATGAGGAGTTGCTTGATTGGGAGTTTCAAGGTAATGTCACTT 
GTCAAAGTGATGATCTATGGGATCTCTCAGATATTGGAGAGATAACATTGGAGTGATTGT
ACCGAGCAAGTGGATTGGCGGCCGCTCTAGACAGGCCTCGTACCGGATCTCTAGCTAGAG 
CTTTCGTTCGTATCATCGGTTTCGACAACGTTCGTCAAGT 

>G1640 Amino Acid Sequence (domain in AA coordinates: 14-115) 

MGRAPCCEKIGLKRGRWTAEEDEILTKYIQTNGEGSWRSLPKKAGLLRCGKSCRLRWINY 

LRRDLKRGNITSDEEEIIVKLHSLLGNRWSLIATHLPGRTDNEIKNYWNSHLSRKIYAFT 

AVSGDGHNLLVNDVVLKKSCSSSSGAKNNNKTKKKKKGRTSRSSMKKHKQMVTASQCFSQ

PKELESDFSEGGQNGNFEGESLGPYEWLDGELERLLSSCVWECTSEEAVIGVNDEKVCES

GDNSSCCVNLFEEEQGSETKIGHVGITEVDHDMTVEREREGSFLSSNSNENNDKDWWVGL

CNSSEVGFGVDEELLDWEFQGNVTCQSDDLWDLSDIGEITLE* 

>G1644 (1..348) 

ATGAAATTGATTGATTGGAAAGACTGTGCTTTGATGACTTACACCGAACTCATTTTGGGT 
TTCTGCAATGTTTTAATGTTGATCTGCAGGAGGACTAGTGGACCTATGAGACGAGCAAAA 
GGTGGTTGGACTCCAGAGGAGGATGAGACACTTAGACGAGCAGTTGAAAAGTATAAGGGG 
AAGAGGTGGAAGAAAATAGCGGAATTTTTCCCAGAGAGAACACAAGTCCAATGCTTGCAC 
AGGTGGCAGAAAGTTCTTAATCCAGAGCTTGTTAAAGGACCTTGGACTCAAGAGGTTCTC 
TTATCATTTTCATGTTCTGAAACTTTTTTTGGTTTTCATTTTACGTAA

>G1644 Amino Acid Sequence (conserved domain in AA coordinates: 39-102)
MKLIDWKDCALMTYTELILGFCNVLMLICRRTSGPMRRAKGGWTPEEDETLRRAVEKYKG
KRWKKIAEFFPERTQVQCLHRWQKVLNPELVKGPWTQEVLLSFSCSETFFGFHFT*
>G1646 (34..786)

GATCTTTTGATCCAATCACAAGGCAAAGATCCAATGGACAATAACAACAACAACAACAAC 
CAGCAACCACCACCAACCTCCGTCTATCCACCTGGCTCCGCCGTCACAACCGTAATCCCT 
CCTCCACCATCTGGATCTGCATCAATAGTCACCGGAGGAGGAGCGACATACCACCACCTC 
CTCCAGCAACAAOVGCAACAGCTTCA^TGTTCTGGACATACCAGAGACAAGAGATCGAA 
CAGGTAAACGATTTCAAAAACCATCAGCTCCCTCTAGCTCGTATCAAAAAAATCATGAAA 
GCTGATGAAGATGTGCGTATGATCTCCGCCGAAGCACCGATTCTCTTCGCGAAAGCTTGT 
GAGCTTTTCATTCTCGAACTTACGATTAGATCTTGGCTTCACGCTGAAGAGAACAAACGT 
CGTACGCTTCAGAAAAACGATATCGCTGCTGCGATTACTAGAACCGATATCTTCGATTTC 
CTTGTTGATATTGTTCCTAGGGAAGAGATCAAGGAAGAGGAAGATGCAGCATCGGCTCTT 
GGTGGAGGAGGTATGGTTGCTCCCGCCGCGAGCGGTGTTCCTTATTATTATCCACCGATG 
GGACAACCGGCGGTTCCTGGAGGGATGATGATTGGAAGACCGGCGATGGATCCTAGCGGT 
GTTTATGCTCAGCCTCCTTCTCAGGCATGGCAAAGCGTTTGGCAGAATTCAGCTGGTGGT 
GGTGATGATGTGTCTTATGGAAGTGGAGGAAGTAGCGGCCATGGTAATCTCGATAGCCAA 
GGGTAAGTGAATTCTAGTAG 
>G1646 Amino Acid Sequence (domain in AA coordinates: 72-162)

MDNNNNNNNQQPPPTSVYPPGSAVTTVIPPPPSGSASIVTGGGATYHHLLQQQQQQLQMF

WTYQRQEIEQVNDFKNHQLPLARIKKIMKADEDVRMISAEAPILFAKACELFILELTIRS

WLHAEENKRRTLQKNDIAAAITRTDIFDFLVDIVPREEIKEEEDAASALGGGGMVAPAAS

GVPYYYPPMGQPAVPGGMMIGRPAMDPSGVYAQPPSQAWQSVWQNSAGGGDDVSYGSGGS 

SGHGNLDSQG*

>G1672 (239..1399)

CCATTCCTGACGTCCGGGATGACGCACAATCCCACTATCCTTCGCAAGACCCTTCCTCTA 

TATAAGGAAGTTCATTTCATTTGGAGAGGACACGCTGACAAGCTGACTCTAGCAGATCTG 

GTACCGATCACTCCCGTCTTTATCAAATTCTTCTTCCTCTTACATTTTCCCTATCCAATC 

GATCTCACGCAGATCTGATCAATTTCTCATCAAATCATTTAGAGATCAAAAGAAAACTAT

GAAGAATAGTAAATGTAACCTCATAGATTCAAAGCTCGAAGAACATCATCATCTTTGCGG 

ATCAAAACATTGTCCTGGATGTGGTCGCATGATTCAAGCTGCTACTAAACCAAATTGGGT 

TGGATTGCCGGCAGGAGTGAAATTCGATCCGACAGATCAAGAACTTATAGAACA 

AGCAAAAGTGAAGGGAAAAGAAGAAAATAAGAAATGGTCGTCGTCTCATCCACTTATAGA 

TGAATTTATTCCCACCATTGATGGAGAAGATGGAATATGTTACACTCATCCTCAGAAGCT 

TCCAGGGGTGACAAGAGATGGCTTGAGCAAACACTTCTTCCACAAACCATCAAGAGCTTA 

CACAACCGGAACAAGAAAACGACGTAAAATAATTCAAACCGATC^ 

CGGATCATCAGAAACCAGGTGGCACAAAACGGGCAAAACAAGACCGGTTATGATCAACGG 
TCAACAAAGAGGATGCAAGAAGATATTAGTACTCTACACTU^CTTCGGCAAGAATCGTCG 
ACCGGAGAAAACAAATTGGGTGATGCATCAATATCATTTAGGGATTAATGAGGAAGAGAG 
AGAAGGAGAACTTGTGGTCTCCAAGATATTTTATCAGACACAACCAAGACAGTGTGTTAG 
TAATACTAATTGGTCTGATCACCATGGTTCCAAGGACGTGATCGGAATTGGTGTCGGAGA 
TGAGATTTCCAGCGTAGCTGCCACGTTGCAGAGTCTTGGCTCCGGTGACGTCGTTTCTAG 
GGTTAATATGCATCCCCATACAAGATCCTTTGATGAGGGGACAGCCGAAGCTTCAAAGGG 
AAGAGAGAACCAGCATGTGTCTGGCACGTGCGAGGAAGTACATGATGGGATCATAACATC 
ATCAATGTCATCTCATCATATGATTCATGATCATCATAATCAACATCATCAAATCGGAGA 
TAGAAGAGAATTTC^CATGTCATC^TCATATCCCATGACCCCrrACTATCACATCACAAC^ 
TGAGTCAATCTTCCATGTTACy^GTACTATGCCCTTTCAGCGGCAGCAATTAAGGGGTCG 
GTCGTCTGGTTCGGGATTAGAAGACCTAATTATGGGTTGTACCACAGCTACGTGTACAGA 
AGACAATAATCACAAATGATTAAATTCGCAGGAGCATTCAGAAGCAAACCCTCAGCGAAA 
TGCAGAGTGGTTAACGTTTCCACAATTCTGGAACCAAGCCGAATCAGATGATCAAAACCG 
AAGATTTTAACAGAACCAAAAGGAAGCAGAGAAATCTTGCAT^AAAGCTCCTGCTTAGCTG 
TTGATCAATGCCGGAAATGCTGAGCTATGACTGACTAGTCTCTGCCATTTAACTTACAAT 
ATCACCAGAGGTTGCGATGAATGTTGATTCGCTCAAAGGAGAGCGGCCGCTCTAGACAGG 
CCTCGTACCG 

>G1672 Amino Acid Sequence (conserved domain in AA coordinates: 41-194) 
MKNSKCNLIDSKLEEHHHLCGSKHCPGCGRMIQAATKPNWVGLPAGVKFDPTDQELIEHL
EAKVKGKEENKKWSSSHPLIDEFIPTIDGEDGICYTHPQKLPGVTRDGLSKHFFHKPSRA

YTTGTRKRRKIIQTDHDSELTGSSETRWHKTGKTRPVMINGQQRGCKKILVLYTNFGKNR
RPEKTNWVMHQYHLGINEEEREGELVVSKIFYQTQPRQCVSNTNWSDHHGSKDVIGIGVG

DEISSVAATLQSLGSGDVVSRVNMHPHTRSFDEGTAEASKGRENQHVSGTCEEVHDGIIT
SSMSSHHMIHDHHNQHHQIGDRREFHMSSSYPMTPTITSQHESIFHVTSTMPFQRQQLRG
RSSGSGLEDLIMGCTTATCTEDNNHK*
>G1677 (24..1037)

CAGTACTAATTCTGTGTGTGTTAATGGTTCTAGTTATGGATGATGAAGAGAGTAACAACG 
TTGAAAGATATGACGACGTCGTATTGCCAGGGTTTAGGTTCCATCCCACTGATGAAGAAC 
TCGTAAGTTTCTACTTGAAACGGAAGGTTTTACACAAATCTCTTCCCTTTGATCTCATCA 
AGAAAGTCGACATTTACAAATACGATCCATGGGACCTCCCAAAGCTTGCAGCGATGGGGG 
AAAAAGAGTGGTACTTTTATTGTCCTAGAGACAGGAAATACCGCAACAGCACAAGACCTA 
ACCGAGTAACTGGAGGTGGCTTCTGGAAAGCAACCGGAACAGACCGGCCTATATACTCAT 
TGGACTCCACTCGATGCATCGGTTTGAAGAAATCACTTGTGTTCTACCGTGGTCGAGCTG 
CTAAAGGAGTCAAAACCGATTGGATGATGCATGAATTTCGTCTCCCTTCTCTCTCTGACT 
CTCATCACTCATCATATCCCAATTACAATAACAAGAAGCAACACCTTAACAATAACAACA 
ACAGGAAGGAGCTTCCTTCAAACGATGCTTGGGCGATATGTAGAATATTTAAGAAGACAA 
ATGCAGTATCCTCACAAAGATCAATCCCACAATCTTGGGTTTATCCAACGATTCCTGACA 
ACAATCAACAGTCACACAACAACACCGCAACTCTCTTAGCTTCATCAGACGTTCTCAGCC 
ACATATCAACAAGACAAAACTTTATTCCTTCTCCAGTCAACGAACCCGCAAGCTTCACAG
AATCAGCTGCTTCTTACTTCGCGTCTCAGATGCTCGGAGTCACGTACAATACAGCCAGAA 
ACAACGGAACAGGGGATGCTCTGTTTCTGAGAAACAATGGAACAGGGGATGCTCTGGTTC 
TGAGCAACAATGAGAATAACTACTTCAACAACTTGACTGGAGGGTTGACTCATGAGGTTC 
CGAATGTAAGATCAATGGTGATGGAGGAGACTACGGGGAGTGAGATGTCGGCGACGTCGT 
ATTCCACTAACAATTAAGATCATAGTACTATTAACACTTGAATTAGTGTAGACGTTGATC 
ATCGCTAATATGTATTAATTTTTCTTGTCTTACTATAAACGAAAAAAAAA

>G1677 Amino Acid Sequence (conserved domain in AA coordinates: 17-181)
MVLVMDDEESNNVERYDDVVLPGFRFHPTDEELVSFYLKRKVLHKSLPFDLIKKVDIYKY

DPWDLPKLAAMGEKEWYFYCPRDRKYRNSTRPNRVTGGGFWKATGTDRPIYSLDSTRCIG 
LKKSLVFYRGRAAKGVKTDWMMHEFRLPSLSDSHHSSYPNYNNKKQHLNNNNNRKELPSN

DAWAICRIFKKTNAVSSQRSIPQSWVYPTIPDNNQQSHNNTATLLASSDVLSHISTRQNF
IPSPVNEPASFTESAASYFASQMLGVTYNTARNNGTGDALFLRNNGTGDALVLSNNENNY
FNNLTGGLTHEVPNVRSMVMEETTGSEMSATSYSTNN*
>G1765 (139..966)

TCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGCTG 

ACAAGCTGACTCTAGCAGATCTGGTACCGTCGACAAGAATGACTTGATTGGTGTTCTAAA 

GAGATCGATGTAGTGAAGATGAGTGGCGAAGGTAACTTAGGTAAGGATCATGAAGAAGAA 

AACGAAGCACCACTTCCTGGGTTC^GGTTTCATCCGACGGATGAAGAGCTTTTAGGATAC 

TATCTTCGAAGAAAAGTAGAGAACAAAACCATCAAACTCGAACTTATCAAACAGATCGAT 

ATCTATAAGTACGATCCTTGGGATCTTCCAAGAGTGAGCAGCGTCGGAGAAAAGGAGTGG 

TACTTCTTCTGCATGAGAGGTAGGAAATACAGGAATAGCGTTCGACCAAACCGAGTGACC 

GGTTCAGGTTTCTGGAAAGCCACTGGTATTGATAAACCGGTTTACTCCAATCTTGACTGT 

GTTGGTCTCAAGAAATCTCTGGTTTACTATCTTGGTTCAGCCGGTAAAGGCACCAAAACC 

GATTGGATGATGCATGAATTCCGCCTCCCCTCCACCACGAAAACCGACTCTCCAGCTCAA 

CAAGCAGAGGTATGGACACTTTGCAGAATCTTCAAACGAGTCACATCTCAAAGAAACCCA 

ACCATCTTACCACCAAACCGAAAACCGGTTATCACTTTAACCGACACTTGTTCTAA 

AGCAGCTTAGATTCCGACCACACGAGCCACCGTACAGTAGATTCCATGTCCCACGAGCCG 

CCGCTTCCACAGCCACAGAATCCTTATTGGAACCAACATATAGTTGGTTTTAATCAACCG 

ACATATACTGGTAATGATAATAACCTCCTGATGAGTTTCTGGAACGGCAACGGTGGAGAT 

TTCATAGGAGACTCAGCAAGTTGGGATGAACTTAGATCTGTTATAGATGGCAACACTAAA 

CCCTAGTAATAAAGTTTCCTTTTTTCAGCTTTGTACAAAAAGATAAAACAAACGGCAACC 

GCTCTAGACAGGCCTCGTACCGGGATCCTCTAGCTAGAGCTTTCGTTTCGTATCATCGGT 

TTCGACAACGTTCGT 

>G1765 Amino Acid Sequence (conserved domain in AA coordinates: 20-140) 

MSGEGNLGKDHEEENEAPLPGFRFHPTDEELLGYYLRRKVENKTIKLELIKQIDIYKYDP

WDLPRVSSVGEKEWYFFCMRGRKYRNSVRPNRVTGSGFWKATGIDKPVYSNLDCVGLKKS

LVYYLGSAGKGTKTDWMMHEFRLPSTTKTDSPAQQAEVWTLCRIFKRVTSQRNPTILPPN

RKPVITLTDTCSKTSSLDSDHTSHRTVDSMSHEPPLPQPQNPYWNQHIVGFNQPTYTGND

NNLLMSFWNGNGGDFIGDSASWDELRSVIDGNTKP*

>G1777 (97..1878)

CTCGTACTTTATCACCTCCGTCGTTCTATAATACTCTCTTCCGTCAATCATATCATTTGT 
CGACAATTTCATTCTGATCAGTTTAAAAATTGATCCATGGATGATAATTTAAGCGGCGAG 
GAAGAAGATTACTATTACTCCTCCGATCAGGAATCTCTCAACGGGATTGATAATGATGAA 
TCCGTTTCGATACCTGTTTCTTCCCGATCAAATACTGTCAAGGTTATTACGAAGGAATCA 
CTTTTGGCTGCACAGAGGGAGGATTTGCGGAGAGTGATGGAATTGTTATCGGTTAAGGAG 
CACCATGCTCGGACTCTTCTTATACATTACCGATGGGATGTGGAGAAGTTGTTTGCTGTT 
CTTGTTGAGAAAGGGAAAGATAGCTTGTTTTCTGGTGCTGGTGTTACACTTCTTGAAAAC 
CAAAGTTGTGATTCTTCCGTTTCTGGTTCTTCTTCGATGATGAGTTGTGATATCTGCGTA 
GAGGATGTACCGGGTTATCAGCTGACAAGGATGGACTGTGGCCATAGCTTTTGCAATAAC 
TGTTGGACTGGGCATTTTACTGTAAAGATAAATGAAGGTCAGAGCAAAAGGATTATATGC 
ATGGCTCATAAGTGTAATGCTATTTGTGATGAAGATGTTGTCAGGGCTCTAGTTAGTAAA 
AGCCAACCAGATTTAGCTGAGAAGTTTGATCGTTTTCTTCTTGAGTCGTATATCGAAGAT 
AACAAAATGGTGAAGTGGTGTCCGAGTACTCCTCATTGTGGGAATGCCATACGTGTTGAG 
GATGACGAGCTCTGTGAGGTTGAATGCTCTTGTGGTTTGCAGTTCTGTTTCAGTTGTTCA 
TCTCAAGCTCACTCCCCTTGCTCTTGTGTGATGTGGGAACTATGGAGAAAGAAGTGCTTT 
GATGAGTCCGAGACTGTTAATTGGATAACTGTTCACACAAAGCCGTGTCCCAAATGTCAC 
AAGCCTGTTGAAAAGAATGGTGGATGCA^TCTCGTGACTTGTCTTTGTCGACAATCTTTT
TGTTGGTTGTGTGGTGAAGCTACTGGAAGGGACCACACTTGGGCTAGAATCTCGGGTCAT 
AGTTGTGGTCGGTTCCAAGAAGATAAAGAGAAACAAATGGAGAGAGCGAAAAGGGATCTC 
AAGCGGTATATGCATTATCATAACCGATACAAAGCACATATCGACTCCTCCAAGCTAGAG 
GCTAAGCTTAGTAATAATATTAGTAAAAAGGTGTCTATTTCAGAAAAGAGGGAGTTACAA 
CTTAAAGACTTCAGCTGGGCTACCAATGGACTCCATCGGTTATTTAGATCAAGACGAGTT 
CTTTCATATTCATACCCTTTCGCATTT^ 

ATGAGCTCTGAGGAAAGAGAAATAAAACAAAATCTGTTTGAGGATCAGCAGCAGCAGCTT 
GAGGCTAATGTTGAGAAACTTTCTAAGTTCTTGGAGGAACCTTTTGATCAATTTGCTGAT 
GATAAGGTCATG(^GATAAGGATTC^^GT<^TCAATTTGT<^GTTGCGGTCGATACACTC 
TGCGAAAATATGTATGAATGCATTGAGAATGACTTGTTGGGTTCTCTGCAACTTGGCATC 
CACAACATTACTCCATACAGATCAAACGGCATAGAACGAGCATCTGATTTTTATAGTTCC 
CAGAATTCCAAGGAAGCTGTTGGTCAGTCTTCGGATTGTGGATGGACGTCCAGGCTCGAT 
CAAGCTTTGGAGTCAGGGAAGTCGGAAGACACAAGTTGCTCTTCCGGGAAGCGTGCTAGA 
ATAGACGAAAGTTA(^GAAACAGCCAAACCACCTTACTAGATTTAAACTTGCCAGCGGAA 
GCCATTGAGCGGAAATGAACACTTATCCTTCTTCACCTCCCAATAACACCCTTTTTGTCC 
AAATAAAGTGTGTTACCCGGATATTTATAGCTCTAAACCCAATCCCCTCTGCTTAATTTG 
TCAGTGACCTTACCTAACCCTCTTCA

>G1777 Amino Acid Sequence (domain in AA coordinates: 124-247)
MDDNLSGEEEDYYYSSDQESLNGIDNDESVSIPVSSRSNTVKVITKESLLAAQREDLRRV

MELLSVKEHHARTLLIHYRWDVEKLFAVLVEKGKDSLFSGAGVTLLENQSCDSSVSGSSS 

MMSCDICVEDVPGYQLTRMDCGHSFCNNCWTGHFTVKINEGQSKRIICMAHKCNAICDED

VVRALVSKSQPDLAEKFDRFLLESYIEDNKMVKWCPSTPHCGNAIRVEDDELCEVECSCG 

LQFCFSCSSQAHSPCSCVMWELWRKKCFDESETVNWITVHTKPCPKCHKPVEKNGGCNLV

TCLCRQSFCWLCGEATGRDHTWARISGHSCGRFQEDKEKQMERAKRDLKRYMHYHNRYKA

HIDSSKLEAKLSNNISKKVSISEKRELQLKDF

GDELFKDEMSSEEREIKQNLFEDQQQQLEANVEKLSKFLEEPFDQFADDKVMQIRIQVIN 
LSVAVDTLCENNYECIENDLLGSLQLGIHNITPYRSNGIERASDFYSSQNSKEAVGQSSD 
CGWTSRLDQALESGKSEDTSCSSGKRARIDESYRNSQTTLLDLNLPAEAIERK*
>G1793 (59..1783)

AGTGATTTATTGATTAACCC^AACACAAAATAAACAGATTTGACTCAAAAAGAAGAAAAT 
GAATTCTAACAACTGGCTTGGCTTTCCTCTTT(^CCGAACAACTClTCTTTGCCTCCTCA 
TGAATACAACCTTGGCTTGGTCAGCGACCATATGGACAACCCTTTTCAAACACAAGAGTG 
GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAAAGTGGC 
CGATTTTCTCGGTGTGAGCAAACCGGACGAAAACCAATCCAACCACCTAGTAGCTTACAA 
CGACTCAGACTACTACTTCCATACCAATAGCTTGATGCCTAGCGTCCAATCAAACGATGT 
CGTTGTAGCAGCTTGTGACTCCAATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA 
GAGTGCTCACAATCTACAGTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT 
TGTAGACAAAGCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 
CGTTGTTGAGACGGCCACGCC^GACGTGCATTGGAC^CTTTCGGACAACGAACCTCGAT 
CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 
TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 
CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 
AACTACTACTAATTTCCCCATTACAAACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT 
GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 
TTCGATGTATCGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG 
CCGAGTCGCCGGAAACAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGT^AGAAGCAGC 
AGAAGCTTACGATATAGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACCAACTTCGA 
GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 
CGCAGCTAAACGGCTCAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 
GATGATAGCCCTTGGTTCAAGTTTCCAGTACGGTGGTGGCTCGAGCACAGGCTCTGGCTC 
CACCTCATCAAGACTTCAGCTTCAACCTTACCCTCTAAGCATTCAACAACCATTAGAGCC 
TTTTCTATCTCTTCAGAACAATGACATCTCTCATTACAACAACAACAATGCTCACGATTC 
CTCCTCTTTTAATCACCATAGCTATATCCAGACACAACTTCATCTCCACCAACAGACCAA 
CAATTACTTGCAGCAACAGTCGAGCCAGAACTCTCAGCAGCTCTACAATGCGTATCTTCA 
TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCTACCTCTATCGTTGACAACAATAATAA 
CAATGGAGGCTCTAGTGGGAGCTACAACACTGCAGCATTTCTTGGGAACCACGGTATTGG 
TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA
CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA 
GGGGTCAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 

GCGGCACAAGGAATGGGT

>G1793 Amino Acid Sequence (conserved domain in AA coordinates: 179-255, 281-34

MNSNNWLGFPLSPNNSSLPPHEYNLGLVSDHMDNPFQTQEWNMINPHGGGGDEGGEVPKV

ADFLGVSKPDENQSNHLVAYNDSDYYFHTNSLMPSVQSNDVVVAACDSNTPNNSSYHELQ

ESAHNLQSLTLSMGTTAGNNVVDKASPSETT^ 

IYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLGGYDKEDKAARSYDLAALKYWGP
STTTNFPITNYEKEVEEMKHMTRQEFVAAIRRKSSGFSRGASMYRGVTRHHQHGRWQARI

GRVAGNKDLYLGTFSTEEEAAEAYDIAAIKFRGLNAVTNFEINRYDVKAILESSTLPIGG
GAAKRLKEAQALESSRKREAEMIALGSSFQYGGGSSTGSGSTSSRLQLQPYPLSIQQPLE 

PFLSLQNNDISHYNNNNAHDSSSFNHHSYIQTQLHLHQQTNNYLQQQSSQNSQQLYNAYL

HSNPALLHGLVSTSIVDNNNNNGGSSGSYNTAAFLGNHGIGIGSSSTVGSTEEFPTVKTD

YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE*

>G180 (54..629)

GTAATTACGATCTACAACAAGTGACATCGTCGTCGACGACGATTCAAGAGAATATGAACT 
TCCTCGTTCCTTTTGAAGAAACGAATGTCTTAAC^^ 

CTCTTTCTTCTCCTTCTTTCCCCATTCACAACTOTTCCTCCACTACTACTACTCATGCAC 

CTCTAGGGTTTTCTAATAATCTTCAGGGTGGAGGACCCTTGGGATCAAAGGTGGTTAATG 

ATGATCAGGAGAATTTTGGAGGTGGAACTAACAATGATGCTCATTCTAATTCTTGGTGGA 

GATCAAATAGTGGAAGTGGAGATATGAAGAACAAAGTGAAGATAAGGAGGAAACTAAGAG 

AGCCAAGATTCTGTTTCCAAACCAAAAGCGATGTTGATGTTCTTGACGATGGCTACAAAT 

GGCGTAAATATGGTCAGAAAGTCGTCAAGAACAGCCTTCACCCCAGGAGTTATTACAGAT 

GCACACACAACAACTGTAGGGTGAAAAAGAGAGTGGAGCGACTATCGGAAGATTGTAGAA 

TGGTGATTACTACTTACGAAGGTCGTCACAACCACATTCCCTCTGATGACTCCACTTCTC 

CTGACCATGATTGTCTCTCTTCCTTTTAACA 

TTATATGTGCACATATAGATGTGTGATATATTGC^ 

AGAGTATGTC^TCAGATGTTATGCATATATTCTTGACTTGTTGCTTATAGTATACATATG 

TAATAATATATATTGACATTGGTAGTTCATTTCTGTTCAAACAAAAAAAAAAAAAAA 

>G180 Amino Acid Sequence (domain in AA coordinates: 118-174) 

MNFLVTFEETNVLTFFSSSSSSSLSSPSFPIHNSSSTTTTHAPLGFSNNLQGGGPLGSKV

VNDDQENFGGGTNNDAHSNSWWRSNSGSGDMKNKVKIRRKLREPRFCFQTKSDVDVLDDG

YKWRKYGQKVVKNSLHPRSYYRCTHNNCRVKKRVERLSEDCRMVITTYEGRHNHIPSDDS

TSPDHDCLSSF*

>G192 (63..959)

CTTTTTTCTCTTCTCTCCTCAGAGATTCGAAGCTTTTTGTCTCCCCTGAGTAACCAAATT 
CAATGGCCGACGATTGGGATCTCCACGCCGTAGTCAGAGGCTGCTCAGCCGTAAGCTCAT 
CAGCTACTACCACCGTATATTCCCCCGGCGTTTCATCTCACACAAACCCTATATTCACCG 
TCGGACGACAAAGTAATGCCGTCTCCTTCGGAGAGATTCGAGATCTCTACACACCGTTCA 
CACAAGAATCTGTCGTCTCTTCGTTTTCXTGTATAAACTACCCAGAAGAACCTAGAAAGC 
CACAGAACCAGAAACGTCCTCTTTCTCTCTCTGCTTCTTCCGGTAGCGTCACTAGCAAAC 
CCAGTGGCTCCAATACCTCTAGATCTAAAAGAAGAAAGATACAGCATAAGAAAGTGTGCC 
ATGTAGCAGCAGAAGCTTTAAACTCCGATGTCTGGGCATGGCGAAAGTACGGAC^GAAAC 
CCATCAAAGGTTCACCATATCCAAGAGGATACTACAGATGTAGTACATCAAAAGGTTGTT 
TAGCCCGTAAACAAGTGGAGCGAAATAGATCCGACCCGAAGATGTTTATCGTCACTTACA 
CGGCGGAGCATAATCATCCAGCTCCGACACACCGTAATTCTCTCGCCGGAAGCACACGTC 
AGAAACCATCCGATCAACAGACGAGTAAATCTCCGACGACCACTATTGCTACTTATTCAT 
CGTCTCCGGTGACTTCAGCCGACGAATTTGTTTTGCCTGTTGAGGATCATCTAGCGGTGG 
GAGATCTTGACGGAGAAGAAGATCTGTTATCTTTGTCGGATACGGTGGTTAGCGATGATT 
TCTTCGATGGGTTAGAGGAATTCGCAGCCGGAGATAGCTTTTCCGGGAACTCGGCTCCGG 
CGAGTTTTGATCTCTCTTGGGTTGTGAACAGTGCCGCCACTACCACCGGAGGAATATGAT 
TAGATTACGACGGCTTAGAATACTCTTATTAGGACAGATTTATAGGATTAAGGAATTATT 
CTCGGAGCATATGTAAAT^TAGGATAAAAGAAAATGTTCTTTGTTACTTTTTTTCGGGTT 
TTCTTCCTATTGTTTCTAAACATCTTAGAAAAAATTTAATTGTATATTCCTTAAGCTCGA 
TACATCTTGTTTTAAAAAAAAAAAAAAAAAA 

>G192 Amino Acid Sequence (domain in AA coordinates: 128-185) 
MADDWDLHAVVRGCSAVSSSATTTVYSPGVSSHTNPIFTVGRQSNAVSFGEIRDLYTPFT
QESVVSSFSCINYPEEPRKPQNQKRPLSLSASSGSVTSKPSGSNTSRSKRRKIQHKKVCH
VAAEALNSDVWAWRKYGQKPIKGSPYPRGYYRCSTSKGCLARKQVERNRSDPKMFIVTYT

AEHNHPAPTHRNSLAGSTRQKPSDQQTSKSPTTTIATYSSSPVTSADEFVLPVEDHLAVG
DLDGEEDLLSLSDTVVSDDFFDGLEEFAAGDSFSGNSAPASFDLSWVVNSAATTTGGI*
>G1948 (18..1118)

AAAAGGTCTTCTTGGCCATGGATACTTGTGCTCTAGTAATCCATCAGTCTCTGTCTCGCA 
TCAAACTTTCTCCTCCCAAATCTTCTTCTTCTTCTTCTTCTGCTTTCTCCCCTGAATCCT 
TACCGATCAGACGGATCGAGCTGTGTTTCCGAGGAGCTATATGTGCCGCCGTACAAAGAA 
ACTACGAAGAAACGACCTCCTCCGTGGAAGAGGCAGAGGAAGATGATGAGTCATCATCAT 
CGTACGGAGAAGTGAACAAGATCATTGGAAGCCGAACGGCGGGGGAAGGAGCCATGGAGT 
ACCTTATCGAGTGGAAGGACGGCCATTCTCCGTCGTGGGTTCCATCGAGCTACATCGCAG 
CAGACGTAGTGTCGGAGTACGAGACACCCTGGTGGACGGCAGCTAGAAAAGCCGACGAGC 
AGGCCCTGTCACAGCTCCTGGAGGACCGAGACGTCGATGCCGTGGACGAAAACGGCCGGA 
CGGCTCTGCTTTTCGTGGCAGGTCTGGGGTCGGACAAGTGCGTAAGGCTTCTGGCGGAGG 
CTGGAGCCGATCTCGACCACCGAGACATGAGGGGAGGCTTGACGGCGCTGCACATGGCGG 
CTGGTTACGTGAGGCCGGAGGTGGTGGAGGCGCTGGTGGAGCTGGGAGCTGATATTGAAG 
TGGAAGACGAGAGAGGGTTAACGGCGTTGGAACTAGCGAGGGAGATTCTGAAGACGACGC 
CGAAGGGGAATCCGATGCAGTTCGGGAGGAGAATTGGGTTAGAGAAAGTGATCAATGTCC 
TGGAAGGACAAGTGTTCGAGTACGCCGAGGTGGATGAGATCGTAGAGAAACGAGGGAAAG 
GCAAAGACGTTGAATATCTGGTCAGATGGAAGGACGGTGGAGATTGCGAGTGGGTGAAAG 
GTGTACACGTGGCGGAAGATGTGGCTAAGGACTACGAGGATGGGCTGGAGTACGCTGTAG 
CGGAGAGTGTGATCGGGAAGAGGGTGGGAGACGATGGGAAGACCATCGAGTATCTTGTCA 
AATGGACTGATATGTCTGATGCCACTTGGGAGCCTCAGGACAATGTCGACTCTACTCTTG 
TTCTACTCTACCAACAAC^U^CAACCAATGAATGAATGATTGATTTTGATGATTACATTCT 
TCTCAATTTGCTTCTTTCTCATATGTGTTGGTTCATCTGACCGGTTCGGTTGGTACGTAC 
CGGTACATTTTC^TTTTCTTTT^ 

CTATTTGATTTTATATCCATGCTTTGAATTTTGCTTCCCTTTTTGGGGAGATTCATGAAA 

>G1948 Amino Acid Sequence (domain in AA coordinates: entire protein)

MDTCALVIHQSLSRIKLSPPKSSSSSSSAFSPESLPIRRIELCFRGAICAAVQRNYEETT 

SSVEEAEEDDESSSSYGEVNKIIGSRTAGEGAMEYLIEWKDGHSPSWVPSSYIAADVVSE

YETPWWTAARKADEQALSQLLEDRDVDAVDENGRTALLFVAGLGSDKCVRLLAEAGADLD

HRDMRGGLTALHMAAGYVRPEVVEALVELGADIEVEDERGLTALELAREILKTTPKGNPM
QFGRRIGLEKVINVLEGQVFEYAEVDEIVEKRGKGKDVEYLVRWKDGGDCEWVKGVHVAE
DVAKDYEDGLEYAVAESVIGKRVGDDGKTIEYLVKWTDMSDATWEPQDNVDSTLVLLYQQ 
QQPMNE* 

>G2123 (1..657) 

ATGAGAAAAGTATGTGAGCTTGATATAGAGCTAAGTGAAGAGGAAAGAGACCTACTAACA 

ACTGGATACAAGAATGTCATGGAGGCTAAGAGAGTTTCATTGAGAGTAATATCATCCATT 

GAAAAAATGGAAGACTCGAAAGGAAACGACCAAAATGTGAAACTGATAAAAGGACAACAA 

GAAATGGTTAAATATGAGTTTTTCAATGTTTGTAATGACATTTTGTCTCTCATTGATTCT 

CATCTCATACCATCAACTACTACTAATGTCGAATCAATTGTCCTTTTTAACAGAGTGAAA 

GGAGATTATTTTCGATATATGGCAGAGTTTGGTTCTGATGCTGAACGTAAAGAAAATGCA 

GATAATTCTCTAGATGCATATAAGGTTGCAATGGAAATGGCAGAGAATAGTTTAGCACCC 

ACCAATATGGTTAGACTTGGATTGG'CTTTAAATTTCTCGATATTC?VATTATGAGATCCAT 

AAATCTATTGAAAGCGCATGTAAATTGGTTAAGAAAGCTTACGATGAAGCAATCACTGAA 

CTCGATGGCCTTGACAAGAATATATGCGAAGAGAGCATGTATATCATAGAGATGCTTAAA 

TACAATCTTTCTACQTGGACTTCAGGCGATGGTAATGGTAATAAGACAGACGGTTAG

>G2123 Amino Acid Sequence (domain in AA coordinates: 99-109)

MRKVCELDIELSEEERDLLTTGYKNVMEAKRVSLRVISSIEKMEDSKGNDQNVKLIKGQQ

EMVKYEFFNVCNDILSLIDSHLIPSTTTNVESIVLFNRVKGDYFRYMAEFGSDAERKENA
DNSLDAYKVAMEMAENSIAPTI^ 

LDGLDKNICEESMYIIEMLKYNLSTWTSGDGNGNKTDG*
>G2138 (27..512)

GGAACCCTAATTTCCGCAAATTCACTATGAAGCGTATTATCAGAATCTCATTCACCGACG 
CAGAAGCCACCGATTCTTCTAGCGACGAAGACACGGAGGAGCGTGGAGGAGCATCCCAGA 
CTCGGCGCCGTGGGAAACGCCTCGTTAAAGAGATCGTAATCGATCCTTCCGATTCCGCCG 
ATAAACTCGATGTCTGCAAAACACGGTTCAAAATCAGGATCCCGGCGGAATTTCTCAAGA

CGGCGAAAACGGAGAAGAAATATCGTGGAGTGAGGCAGAGGCCGTGGGGGAAGTGGGTGG 

CGGAGATCAGATGTGGAAGAGGAGCTTGTAAAGGACGACGTGATCGTCTCTGGCTGGGTA 

CTTTTAACACTGCTGAGGAAGCTGCTCTAGCTTATGATAACGCTTCAATTAAGCTGATTG 

GACCTCACGCGCCGACCAATTTTGGTTTGCCGGCGGAGAATCAAGAGGATAAGACGGTGA 

TTGGAGCTTCTGAGGTTGCTAGAGGCGCGTGAAGTGGGGTTGGTAATTTAGTTGTTAGC 

>G2138 Amino Acid Sequence (domain in AA coordinates: TBD) 

MKRIIRISFTDAEATDSSSDEDTEERGGASQTRRRGKRLVKEIVIDPSDSADKLDVCKTR

FKIRIPAEFLKTAKTEKKYRGVRQRPWGKWVAEIRCGRGACKGRRDRLWLGTFNTAEEAA 

LAYDNASIKLIGPHAPTNFGLPAENQEDKTVIGASEVARGA*

>G2139 (40..663)

CCTACAAGAAATCAAACACTAGTTCTGGTTTCTGCAAACATGTCATCTACGAAGCAAGCA 
AAGGGAAGAAAAACAAAGGGGAAGCAAAAGATCGAGATGAAGAAGGTGGAGAACTATGGA 
GATAGGATGATTACGTTCTCAAAACGTAAAACCGGAATTTTTAAGAAAATGAACGAGCTC 
GTAGCAATGTGTGACGTTGAAGTGGCTTTCTTGATTTTCTCTCAACCCAAGAAGCCCTAT 
ACATTCGCACATCCGTCTATGAAGAAAGTGGCTGACCGGTTAAAGAACCCTTCGAGACAA 
GAACCATTAGAGAGAGACGATACCAGACCCCTCGTCGAAGCTTATAAGAAACGAAGGCTC 
CACGACCTCGTAAAAAAAATGGAGGCGCTCGAAGAGGAGCTTGCGATGGATCTAGAGAAG 
TTGAAACTGTTGAAGGAATCGAGAAATGAAAAGAAGTTAGATAAAATGTGGTGGAACTTT 
CCTTCGGAAGGTTTGAGCGCGAAGGAGCTGCAGCAAAGGTACCAAGCGATGCTCGAGTTA 
CGTGATAACTTATGCGACAATATGGCTCACTTACGATTGGGAAAAGACTGTGGTGGTTCA 
TCTTCTGTTCGTGTGGGACGTCGAGTTTCTGGTGGTGTTCGTCTGTTCGATCGTGAAGCA 
TGATCATACATATTCATACTTGATGATTT^ 

ATACTGCATGTATCCATTTGACGAAGCTCAATCGTCTCGAGTATATCTCTATTATCTAAC 
AGTATTGAGAAAAAAGGAGTTTCAGTAAAAAAAA7UU\AAAAAAAAAAA

>G2139 Amino Acid Sequence (conserved domain in AA coordinates: 14-69)

MSSTKQAKGRKTKGKQKIEMKKVENYGDRMITFSKRKTGIFKKMNELVAMCDVEVAFLIF

SQPKKPYTFAHPSMKKVADRLKNPSRQEPLERDDTRPLVEAYKKRRLHDLVKKMEALEEE

LAMDLEKLKLLKESRNEKKLDKMWWNFPSEGLSAKELQQRYQAMLELRDNLCDNMAHLRL

GKDCGGSSSVRVGRRVSGGVRLFDREA*

>G2343 (1..1113)

ATGGGTCATCACTCATGCTGCAACCAGCAAAAGGTGAAGAGAGGGCTTTGGTCACCGGAA 
GAAGATGAGAAGCTTATTAGATATATCACAACTCATGGCTATGGATGTTGGAGTGAAGTC 
CCTGAAAAAGCAGGGCTTCAAAGATGTGGAAAAAGTTGTAGATTGCGATGGATAAACTAT 
CTTCGACCTGATATCAGGAGAGGAAGGTTCTCTCCAGAAGAAGAGAAATTGATCATAAGC 
CTTCATGGAGTTGTGGGAAACAGGTGGGCTCATATAGCTAGTCATTTACCGGGAAGAACA 
GATAACGAGATTAAAAACTATTGGAATTCATGGATTAAGAAAAAGATACGAAAACCGCAC 
CATCATTACAGTCGTCATCAACCGTCAGTAACTACTGTGACATTGAATGCGGACACTACA 
TCGATTGCCACTACCATCGAGGCCTCTACCACCACAACATCGACTATCGATAACTTACAT 
TTTGACGGTTTCACTGATTCTCCTAACCAATTAAATTTCACCAATGATCAAGAAACTAAT
ATAAAGATTCAAGAAACTTTTTTCTCCCATAAACCTCCTCTCTTCATGGTAGACACAACA 
CTTCCTATCCTAGAAGGAATGTTCTCTGAAAACATCATCACAAACAATAACAAGAACAAT 
GATCATGATGACACGCAAAGAGGAGGAAGAGAAAATGTTTGTGAACAAGCATTTCTAACA 
ACTAACACGGAAGAATGGGATATGAATCTTCGTCAGCAAGAGCCGTTTCAAGTTCCTACA 
CTGGCGTCACATGTGTTCAACAACTCTTCCAATTCAAATATTGACACGGTTATAAGTTAT 
AATCTACCGGCGCTAATAGAGGGAAATGTCGATAACATCGTCCATAATGAAAACAGCAAT 
GTCCAAGATGGAGAAATGGCGTCCACATTCGAATGTTTAAAGAGGCAAGAACTAAGCTAT 
GATCAATGGGACGATTCACAACAATGCTCTAACTTTTTCTTTTGGGACAACCTTAATATA 
AACGTGGAAGGTTCATCTCTTGTTGGAAACCAAGACCCATCAATGAATTTGGGATCATCT 
GCCTTATCTTCTTCTTTCCCTTCTTCGTTTTAA 

>G2343 Amino Acid Sequence (domain in AA coordinates: 14-116) 

MGHHSCCNQQKVKRGLWSPEEDEKLIRYITTHGYGCWSEVPEKAGLQRCGKSCRLRWINY

LRPDIRRGRFSPEEEKLIISLHGVVGNRWAHIASHLPGRTDNEIKNYWNSWIKKKIRKPH

HHYSRHQPSVTTVTLNADTTSIATTIEASTTTTSTIDNLHFDGFTDSPNQLNFTNDQETN

IKIQETFFSHKPPLFMVDTTLPILEGMFSENIITNNNKNNDHDDTQRGGRENVCEQAFLT

TNTEEWDMNLRQQEPFQVPTLASHVFNNSSNSNIDTVISYNLPALIEGNVDNIVHNENSN

VQDGEMASTFECLKRQELSYDQWDDSQQCSNFFFWDNLNINVEGSSLVGNQDPSMNLGSS
ALSSSFPSSF*
>G265 (280..1317)

CTTTGGTCTTGGAAGCGAAATCAAACCTTTC^ 

TCTTTTGCTTTACGTTCTCTCAATTCTTATTTGTAAGAAAGTTTGTTCCTTTAA 

AAATCAAAGAGACTTTTGAAGATTGTTTCCCAATTTGCGTCAATCGGGATCGAGTCAAAT 

CTGAAATCTTCTCCACTCATCATCTGACTATAAGACTTAATCAAGGGACTTTTTGTTCGG 

GTTTGGTTTTAAACGTCTTGGATTCGAAGTGGTTAAGGTATGGATGAAAATAATGGAGGT 

TCAAGCTCACTTCCACCTTTCCTTACTAAAACATATGAAATGGTTGATGATTCTTCTTCT 

GACTCGGTCGTTGCTTGGAGCGAAAACAACAAAAGCTTCATCGTCAAGAATCCAGCAGAG 

TTTTCAAGAGACCTTCTTCCGAGATTCTTCAAGCATAAGAATTTCTCAAGTTTCATCCGT 

CAGCTTAATACATATGGTTTTCGAAAAGTAGATCCTGAGAAATGGGAATTCTTGAATGAT 

GATTTTGTTAGAGGTCGACCTTACCTTATGAAGAACATTCATAGACGAAAACCGGTTCAT 

AGCCACTCGTTAGTGAATCTACAAGCGCAAAATCCTTTGACGGAATCAGAAAGACGGAGC 

ATGGAGGATCAGATAGAAAGACTGAAAAATGAGAAAGAAGGCCTTCTTGCGGAGTTACAG 

AACCAAGAGCAAGAACGGAAAGAGTTTGAGCTGCAAGTAACGACATTGAAAGATCGGTTA 

CAACATATGGAACAACATCAGAAATCAATAGTGGCATATGTTTCACAGGTTTTGGGAAAA 

CCAGGACTTTCACTAAACCTCGAAAACCATGAGAGAAGAAAAAGAAGATTTCAAGAGAAC 

TCTCTTCCTCCAAGCAGTTCACACATAGAACAGGTCGAAAAGTTAGAATCTTCGCTAACG 

TTTTGGGAGAATCTTGTATCGGAATCATGCGAGAAGAGCGGTTTGCAGTCATCAAGCATG 

GATCATGATGCAGCTGAGTCAAGTCTAAGTATTGGCGATACACGACCCAAATCATCGAAG 

ATTGATATGAACTCAGAGCCGCCCGTTACCGTTACTGCGCCTGCTCCAAAAACAGGCGTT 

AACGATGACTTTTGGGAACAATGTTTGACAGAGAACCCTGGATCAACCGAGCAACAAGAA 

GTTCAGTCAGAGAGAAGAGATGTCGGTAATGATAATAATGGTAATAAGATTGGAAATCAA

AGGACGTATTGGTGGAATTCAGGGAATGTAAATAACATTACAGAGAAAGCTTCTTGACAT 

GAATGAGGTTTTTGTAAAATAGTTTTCTTTTGGTTCCACTGAGATTATTGTATGTGTTCA 

TTATTTATTACTCTGTTTCTGTAAAAACAAATCTCTCTATTGTTTGAGGCAGGAGTGACA 

TAAATGCATATGCAGAATTGGTTTCAAAAA 

>G265 Amino Acid Sequence (domain in AA coordinates: 11-105) 
MDENNGGSSSLPPFLTKTYEMVDDSSSDSVVAWSENNKSFIVKNPAEFSRDLLPRFFKHK
NFSSFIRQLNTYGFRKVDPEKWEFLNDDFVRGRPYLMKNIHRRKPVHSHSLVNLQAQNPL
TESERRSMEDQIERLKNEKEGLLAELQNQEQERKEFELQVTTLKDRLQHMEQHQKSIVAY
VSQVLGKPGLSLNLENHERRKRRFQENSLPPSSSHIEQVEKLESSLTFWENLVSESCEKS
GLQSSSMDHDAAESSLSIGDTRPKSSKIDMNSEPPVTVTAPAPKTGVNDDFWEQCLTENP
GSTEQQEVQSERRDVGNDNNGNKIGNQRTYWWNSGNVNNITEKAS*
>G2792 (1..960) 

ATGGATCATCATCATCACATAGCATCAAGAAATTCATCAACAACATCAGAATTACCATCA 

TTCGAGCCAGCGTGCCATAACGGTAATGGTAACGGTTGGATCTATGACCCAAATCAAGTT 

AGGTACGATCAAAGTAGTGACCAACGGCTGTCAAAGTTGACGGATCTTGTAGGCAAGCAC 

TGGTCAATTGCACCACCGAATAATCCCGACATGAACCATAACCTTCATCATCACTTCGAT 

CATGATCATTCTGAAAACGACGACATTTCTATGTACAGACAAGCCTTGGAGGTGAAAAAT 

GAGGAAGATCTTTGTTACAATAATGGCTCAAGTGGTGGTGGTTCCTTGTTCCATGATCCT 

ATAGAAAGTTCTAGAAGTTTCCTTGATATAAGGTTAAGTAGGCCATTAACGGATATTAAT 

CCGTCATTTAAGCCATGCTTTAAGGCCTTAAACGTATCCGAGTTTAACAAGAAAGAACAT 

CAAACGGCATCTCTGGCAGCAGTGAGACTGGGAACAACAAACGCTGGAAAAAAGAAGAGA 

TGTGAAGAAATTTCCGATGAGGTTTCAAAGAAGGCCAAGTGCAGTGAGGGCTCTACACTT 

TCGCCAGAGAAGGAACTACCCAAAGCCAAACTTCGAGACAAGATCACGACTCTACAGCAA 

ATTGTGTCTCCCTTTGGAAAGACTGATACTGCTTCTGTGCTTCAAGAGGCCATCACTTAC 

ATAAATTTTTATCAAGAGCAAGTTAAGCTGCTAAGCACTCCTTATATGAAGAATTCATCA 

ATGAAGGATCCATGGGGGGGATGGGACAGAGAAGATCACAACAAAAGGGGACCGAAGCAT 

CTTGATCTAAGGAGTAGAGGGCTTTGTTTGGTTCCTATTTCATATACCCGAATCGCATAC 

CGCGATAACAGTGCAACTGACTACTGGAATCCCACGTATAGAGGTTCTTTGTATCGTTAG 

>G2792 Amino Acid Sequence (domain in AA coordinates: 190-258)

MDHHHHIASRNSSTTSELPSFEPACHNGNGNGWIYDPNQVRYDQSSDQRLSKLTDLVGKH

WSIAPPNNPDMNHNLHHHFDHDHSENDDISMYRQALEVKNEEDLCYNNGSSGGGSLFHDP

IESSRSFLDIRLSRPLTDINPSFKPCFKALNVSEFNKKEHQTASLAAVRLGTTNAGKKKR

CEEISDEVSKKAKCSEGSTLSPEKELPKAKLRDKITTLQQIVSPFGKTDTASVLQEAITY 
INFYQEQVKLLSTPYMKNSSMKDPWGGWDREDHNKRGPKHLDLRSRGLCLVPISYTPIAY 
RDNSATDYWNPTYRGSLYR*
>G2830 (1..903) 

ATGTCTTCCATCCCAAATAGGTTCAATATTTATGGTGGTGATACCACAAACCATCGTGAA 

TCGCTTCCCATCGAAATGAATCACAACTCTCGAATGGTTCGATCCATGTTCATTACATCT 

GATCGCATGAATCATAGAGATTTGTTTTCTTCTCCTCCTTCTTTCTCTTCTTATCAAAAT

TCACATATCTCTTCATCTTCTGTTGGGTTTAATAATTCACATATGACTTATCATATGCTG 

AAAAGAAATTATGATTCTGTTTCCCGTGCTGATTATTTCTCTACTAAAGATCATTCTCAT 

TTTACTCAAGTATCTTTCACTCAAACCATCACAAATAAGTATACTACTATTGTTCCTTCC 

AATATATTTGACACTGTTCACTATGATATTGGTCGTGTCAAACGTGCCATAGATTTO 

AATATTTGGAATCCTAAATCTCATCTTCCAAAAAAATTTAATAGGCAATGCGAGATTTTG 

T^ATCCTACCCCTCTTAATATCGTCTTTCCGCACCAGGATTCAGCTGATCGTCAACATTTA 

GACATTATTTTCTCGTCATCAAAGCACAACCATGT^ 

AAAATTTCCGAACCAACCAATCTGTTTGAAAAATCTAATTCTTATGATTCTCAAGAAGAT 
GAGAAAATCGATGCTTATCAATATGATGGTCGTACACATAGTCTACCGTATACGAAATAC 
GGTCCATATACATGTCCCAGGTGTAACGGTGTGTTTGATACTTCTCAAAAATTTGCTGCA 
CATATGTTATCTCACTACAATAATGAGACGGACAAAGAAAGAGACCAAAGATTTCGTGCA 
AGAAATAAAAAACGATATCGTAAGTTTATGGACAGTCTTAAAATATCAAZ^ACAGAAGATA 
TGA 

>G2830 Amino Acid Sequence (domain in AA coordinates: 245-266)
MSSIPNRFNIYGGDTTNHRESLPIEMNHNSRMVRSMFITSDRMNHRDLFSSPPSFSSYQN
SHISSSSVGFNNSHMTYHMLKRNYDSVSRADYFST^ 
NIFDTVHYDIGRVKRAIDFRNIWNPKSHLPKKFN^^ 

DIIFSSSKHNHVFQDGRSLKKISEPTNLFEKSNSYDSQEDEKIDAYQYDGRTHSLPYTKY 
GPYTCPRCNGVTOTSQKFAAHMLSHYl^ETDKERDQR 

>G286 (94..2454) 

TGCAATTTCTCTCGACCAAAACCCTAATTTCAGGTTTGGGGTTTTCCTTCTTTCACTGTC 
AATTTTGATGAAACTTGTGATTCAGTGATT^ 

GCCAATGGCATTGGCAATGGCAATGGTGAGTCTATTCCCGGGATTCCAGATGACTTACGG 
TGCAAGAGATCGGATGGTAAACAGTGGAGATGCACTGCAATGTCCATGGCTGATAAGACT 
GTTTGTGAGAAGCACTACATCCAAGCAAAGAAGCGGGCGGCTAATTCTGCTTTCAGGGCG 
AACCAGAAGAAAGCGAAAAGGCGATCATCGTTAGGCGAAACAGATACGTATTCGGAAGGG 
AAGATGGATGATTTCGAGTTACCAGTCACCAGCATTGACCACTATAATAACGGTCTTGCC 
TCTGCTTCCAAGAGTAATGGTAGACTAGAGAAGAGACATAATAAAAGCCTGATGCGGTAC 
TCGCCCGAGACACCGATGATGAGGAGTTTCTCTCCACGTGTTGCAGTGGATTTGAATGAT 
GACTTGGGTAGAGATGTTGTAATGTTTGAAGAGGGCTACAGATCTTATAGGACACCACCA 
TCTGTTGCTGTTATGGATCCGACACGAAACAGATCACACCAAAGCACCAGTCCTATGGAA 
TACTCAGCAGCAAGCACAGATGTGTCTGCAGAGTCTTTGGGGGAAATCTGCCATCAATGC 
CAGAGAAAAGATAGAGAGAGAATCATTTCTTGCCTCAAATGCAATCAAAGAGCCTTCTGC 
CACTUVTTGTCTATCGGCAAGGTACTCGGAGATATC^CTTGAAGAAGTCGAGAAAGTTTGC 
CCTGCATGTCGTGGCTTGTGTGATTGCAAATCTTGCCTGCGTTCAGATAATACAATAAAG 
GTTCGGATCCGGGAT^ATACCCGTTTTGGACAAGTTGCAGTATCTTTATCGTCTATTATCA 
GCTGTCCTACCAGTCATAAAGCAGATCCATCTTGAACAATGTATGGAAGTTGAACTAGAG 
AAGAGGCTTCTTGAAGTTGAGATTGATCTTGTCAGGGCAAGATTGAAAGCAGATGAGCAG 
ATGTGCTGCAACGTGTGTCGGATACCAGTTGTTGACTACTACCGTCACTGTCCGAACTGC
TCATATGACCTTTGCCTGAGATGCTGTCAAGATCTACGGGAAGAGTCTTCAGTGACGATT 
AGTGGGACTAACCAAAACGTACAAGATAGAAAAGGAGCTCCCAAACTAAAACTAAACTTT 
TCATACAAGTTTCGTGAGTGGGAAGCCAACGGTGATGGGAGCATCCCTTGCCCTCCTAAG 
GAGTATGGAGGCTGCGGTTCACATTCTTTGAATCTTGCCCGCATTTTCAAGATGAATTGG 
GTTGCAAAGCTTGTGAAAAATGCTGAGGAGATTGTTAGTGGCTGCAAATTATCTGATCTT 
CTGAACCCTGATATGTGTGATTCAAGATTCTGCAAATTTGCTGAGAGAGAAGAGAGCGGT 
GACAACTACGTGTACAGCCCGTCGCTTGAAACGATTAAAACTGATGGAGTAGCTAAGTTT 
GAGCAACAATGGGCAGAGGGTCGGCTTGTTACTGTGAAAATGGTACTTGATGACTCATCT 
TGGTCTAGATGGGATCCTGAGACTATTTGGAGGGATATAGACGAGCTTTCGGACGAGAAA 
CTGAGAGAACATGATCCATTCTTGAAGGCCATTAATTGCTTGGATGGTTTAGAGGTTGAT 
GTAAGACTTGGGGAGTTTACAAGAGCATATAAAGATGGAAAGAACCAAGAGACAGGTCTT 
CCGCTATTGTGGAAGTTAAAGGACTGGCCGAGCCCAAGTGCTTCCGAGGAGTTCATTTTC 
TACCAAAGACCTGAGTTTATCAGAAGTTTTCCGTTTCTCGAGTACATTCATCCCCGGTTA
GGCCTTCTGAATGTTGCAGCCAAGTTACCTCATTACTCGCTCCAAAACGATTCAGGTCCA 
AAGATTTATGTGTCTTGTGGGACGTACCAAGAAATCAGTGCTGGCGATTCATTGACTGGT 
ATTCACTACAACATGCGTGACATGGTATACCTATTGGTGCACACGTCTGAAGAAACAACA 
TTCGAAAGGGTGAGAAAAACAAAACCTGTTCCAGAGGAACCTGACCAGAAGATGAGCGAA 
AATGAGTCACTTCTTAGCCCTGAGCAGAAATTAAGGGACGGAGAGTTACATGATCTATCA 
CTTGGTGAAGCCAGTATGGAGAAGAATGAACCTGAGTTGGCGTTGACTGTGAATCCAGAG 
AACTTAACGGAAAACGGTGACAACATGGAATCTTCTTGCACATCTTCATGTGCAGGAGGA 
GCCCAGTGGGATGTCTTTCGACGCCAAGACGTCCCAAAGTTGTCCGGGTATTTGCAGAGA 
ACATTCCAGAAGCCTGATAATATCCAGACTGATTTTGTAAGCCGTACCTGCTAATTCAAA 
TAAATGAAGTGTGTAAAGTCTTGTATGTGGAATGATTGAGTTTCCTAGTTTGTTTACTCT 
GGTTTCAGGTGTCACGCCCGTTGTATGAAGGATTGTCTTTAAATGAACACCACAAGAGAC 
AACTAAGAGACGAGTTTGGAGTTGAGCCATGGACATTTGAGCAACATCGTGGTGAGGCTA 
TCTTCATTCCGGCTGGATGTCCGTTCCAAATCACTAATCTTCAGTCGAATATTCAGGTGG 
CACTTGACTTCTTGTGCCCTGAAAGCGTTGGAGA^^ 

GGTGTTTACCAAACGACCACGAGGGAAAACTTCAGATTCTAGAGATTGGAAAGATATCAT 
TATACGCAGCTAGCTCAC3CCATTAAAGAGGTTCAGAAACTGGTCTTGGATCCAAAGTTTG 
GAGCAGAGCTTGGATTTGAAGACTCTAACTTAACCAAAGCAGTCTCTCACAACTTAGACG 
AGGCAACCAAGCGGCC 

>G286 Amino Acid Sequence (domain in AA coordinates: TBD) 
MNANEQTRSANGIGNGNGESIPGIPDDLRCKRSDGKQWRCTAMSMADKTVCEKHYIQAKK 
RAANSAPRANQKKAKRRSSLGETDTYSEGKMDDFEL^ 
RHNKSLMRYSPETPMMRSFSPRVAVDLNDDLGRDVV^ 

SHQSTSPMEYSAASTDVSAESLGEICHQCQRKDRERIISCLKCNQRAFCHNCLSARYSEI 
SLEEVEKVCPACRGLCDCKSCLRSDNTIKVRIREIPVLDKLQYLYRLLSAVLPVIKQIHL
EQCMEVELEKRLLEVEIDLVRARLKADEQMCCNVCRIPVVDYYRHCPNCSYDLCLRCCQD 
LREESSVTISGTNQNVQDRKGAPKLKLNFSYKFPEWEANGDGSIPCPPKEYGGCGSHSLN 
LARIFKMNWVAKLVKNAEEIVSGC^ 

IKTDGVAKFEQQWAEGRLVTVKMVLDDSSCSRWDPETIWRDIDELSDEKLREHDPFLKAI 

NCLDGLEVDVRLGEFTRAYRDGKNQETGLPLLWKLKDWPSPSASEEFIFYQRPEFIRSFP 

FLEYIHPRLGLLNVAAKLPHYSLQNDSGPKIYVSCGTYQEISAGDSLTGIHYNMRDMVYL

LVHTSEETTFERVRKTKPVPEEPDQKMSENESLLSPEQKLRDGELHDLSLGEASMEKNEP 

ELALTVNPENLTENGDNMESSCTSSCAGGAQWDVFRRQDVPKLSGYLQRTFQKPDNIQTD

FVSRTC* 

>G291 (124..1197)

CAAGAACCCAAAGATCTCTCTCTATTTGTTT^ 

TCAAATCAATTCTCGCGATTAAGCAAAACCCTAGATTTATTCTACTCTTCGAAGTCGATT 
TCAATGGAAGGTTCCTCGTCAGCCATCGCGAGGAAGACATGGGAGCTAGAGAACAACATT 
CTCCCAGTGGAACCAACCGATTCAGCCTCCGACAGTATATTCCACTACGACGACGCTTCA 
CAAGCCAAAATCCAGCAGGAGAAGCCATGGGCCTCCGATCCTAACTACTTCAAGCGCGTT 
C^CATCTCAGCCCTTGCTCTTCTCAAGATGGTGGTTCACGCTCGCTCCGGTGGCACAATC 
GAGATCATGGGTCTTATGCAGGGTAAAACCGAGGGTGATACAATCATCGTTATGGATGCT 
TTTGCTTTGCCTGTTGAAGGTACTGAGACTAGGGTTAATGCTCAGTCTGATGCCTATGAG
TATATGGTTGAATACTCTCAGACCAGCAAGCTGGCTGGGAGGTTGGAGAACGTTGTTGGA 
TGGTATCACTCTCACCCTGGGTATGGATGTTGGCTCTCGGGTATTGATGTTTCGACACAG 
ATGCTTAACCAACAGTATCAGGAGCCATTCTTAGCTGTTGTTATTGATCCAACAAGGACT 
GTTTCGGCTGGTAAGGTTGAGATTGGGGCATTCAGAACATATCCAGAGGGACATAAGATC 
TCGGATGATCATGTTTCTGAGTATCAGACTATCCCTCTTAACAAGATTGAGGACTTTGGT 
GTACATTGCAAACAGTACTACTCATTGGACATCACTTATTTCAAGTCATCTCTCGATAGT 
CACCTTCTGGATCTCCTTTGGAACAAGTACTGGGTGAACACTCTTTCTTCTTCCCCACTG 
TTGGGCAATGGAGACTATGTTGCCGGGCAAATATCAGACTTGGCTGAGAAGCTCGAGCAA 
GCGGAGAGTCAGCTCGCTAACTCCCGGTATGGAGGAATTGCGCCAGCCGGTCACCAAAGG 
AGGAAAGAGGATGAGCCTCAACTCGCGAAGATAACTCGGGATAGTGCAAAGATAACTGTC 
GAGCAGGTCC^TGGACTAATGTCACAGGTTATCAAAGACATCTTGTTCAATTCCGCTCGT 
CAGTCCAAGAAGTCTGCTGACGACTCATCAGATCCAGAGCCCATGATTACATCGTGAAGT 
TGGTCTATTCTTTTGTTTTTTGGCTGCGGAAATTGACTATCGGTTTGACCCGGTTTATGA 
GGCAATGCCCATTGTTCCCTATATCTCTAGTGTAGTATCTGCTTCAGACAAAGATCTTTG 
GGTTATTAAATGACATTAACATAAAAAAAA

>G291 Amino Acid Sequence (domain in AA coordinates: 132-160) 

MEGSSSAIARKTWELENNILPVEPTDSASDSIFHYDDASQAKIQQEKPWASDPNYFKRVH

ISALALLKMVVHARSGGTIEIMGLMQGKTEGDTIIVMDAFALPVEGTETRVNAQSDAYEY

MVEYSQTSKLAGRLENVVGWYHSHPGYGCWLSGIDVSTQMLNQQYQEPFLAVVIDPTRTV 

SAGKVEIGAFRTYPEGHKISDDHVSEYQTIPLNKIEDFGVHCKQYYSLDITYFKSSLDSH 

LLDLLWNKYWVNTLSSSPLLGNGDYVAGQISDLAEKLEQAESQLANSRYGGIAPAGHQRR 

KEDEPQLAKITRDSAKITVEQVHGLMSQVIKDILFNSARQSKKSADDSSDPEPMITS* 

>G427 (49..1230)

TTTCCCTCTCCGAAACAGAAATTCAAAAACAAATTCAACACGAAAACGATGGCGTTTCAT 
AACAATCACTTTAATCATTTC^CCGACCAAC^CAACATCAGCCTCCTCCTCCGCCGC^ 
CAGCAGCAGCAACAACATTTTCAAGAATCAGCACCCCCTAATTGGCTCCTCCGCTCCGAC 
AACAACTTCCTC^TCTCCAC^C^GCTGCCA(^GCCGCCGCTACAAGCTCCGATTCTCCT 
TCTTCCGCCGCCGCTAACCAGTGGCTCTCACGATCCTCATCCTTCCTCCAACGAGGCAAC 
ACCGCAAACAACAACAACAACGAAACATCCGGTGACGTCATCGAAGACGTTCCCGGCGGA 
GAGGAGTCAATGATCGGAGAGAAGAAGGAGGCGGAGAGGTGGCAGAATGCGAGACACAAG 
GCGGAGATACTGTCTCATCCACTATACGAGCAACTTTTGTCGGCACACGTGGCGTGCCTG 
AGGATCGCAACGCCGGTGGATCAGCTTCCGAGGATAGACGCACAGCTTGCTCAGTCTCAA 
AACGTCGTGGCTAAGTACTCAACTTTAGAAGCCGCTCAAGGACTCCTCGCCGGCGATGAC 
AAGGAGCTTGACCACTTGATGACGCATTATGTACTATTC 

CTGCAACAGCATGTTCGTGTTCATGCAATGGAAGCTGTTATGGCCTGTTGGGAGATTGAA 

CAGTCGCTTCAAAGTTTTACAGGAGTATCTCCTGGTGAAGGCACAGGAGC^CAATGTCT 

GAGGATGAAGATGAGCAAGTAGAGAGTGATGCTCATTTGTTTGATGGAAGCTTAGATGGG 

TTAGGGTTTGGTCCTCTAGTTCCCACTGAGAGCGAGAGATCTTTGATGGAACGAGTCAGA 

CAAGAACTCAAACATGAACTCAAGCAGGGTTACAAGGAGAAAATTGTGGACATM 

GAGATACTGAGGAAGAGAAGAGCTGGAAAATTACCAGGAGACACCACCTCTGTTCTCAAA 

TCATGGTGGCAATCTCATTCTAAGTGGCCTTACCCTACTGAGGAAGATAAGGCGAGGTTG 

GTGCAGGAGACGGGTTTGCAGCTCAAACAGATAAACAATTGGTTCATCAATCAAAGAAAG 

AGGAATTGGCATAGCAATCCATCTTCTTCTACCGTCTCAAAGAATAAACGCCGAAGCAAT 

GCAGGTGAAAACAGCGGAAGAGACCGTTGAGATCAAGCTTGCATGTAGAGATCCAAAAGC 

TTTATAGAAAGGTGGAGGCATGAAGACAAAGAATTCTTACACAACAAACGTAGGACGTAA 

TTTTGTGCCAGTACATGGTATGGCTTTCATATTTGGTAATGATTAGGGCCACACAAAATT 

AAACCCCAAAGCATGATTTGTAATATGAGGTTTTAGATGGACTTTATGATAGGATCGTCA 

GTCTTCACTGCCATCTCCATTCTCCACC^TCAATCC^TCATTATATCTTGTGAAAAAAAA 

A 

>G427 Amino Acid Sequence (domain in AA coordinates: 307-370) 
MAFHNNHFNHFTDQQQHQPPPPPQQQCX2QHFQESA 

SDSPSSAAANQWLSRSSSFLQRGNTANNNNNETSGDVIEDVPGGEESMIGEKKEAERWQN

ARHKAEILSHPLYEQLLSAHVACLRIATPVDQLPRIDAQLAQSQNVVAKYSTLEAAQGLL

AGDDKELDHFMTHYVLLLCSFKEQLQQHVRVHAMEAVMACWEIEQSLQSFTGVSPGEGTG

ATMSEDEDEQVESDAHLFDGSLDGLGFGPLVPTESERSLMERVRQELKHELKQGYKEKIV

DIREEILRKRRAGKLPGDTTSVLKSWWQSHSKWPYPTEEDKARLVQETGLQLKQINNWFI 

NQRKRNWHSNPSSSTVSKNKRRSNAGENSGRDR* 

>G509 (122..1054)

CTTCCTCCTTTGCTAATAAACTTTTCTTTGAACCTTACACGCCTTGTTGATATTACTCTC 
TTAAATATATATTTTCGTACATTAACACAGACATATATAAAGCTAAAGATTTCTTCACGT 
AATGGGTTTGAAAGATATTGGGTCCAAATTGCCACCGGGGTTTCGATTTCATCCAAGTGA 
TGAAGAGTTGGTTTGTCATTATCTTTGCAACAAGATTAGGGCCAAATCTGATCATGGTGA 
TGTTGATGATGATGATGATGATGTTGATGAAGCTTTGAAGGGTTCTACTGATCTTGTGGA 
GATTGACTTGCATATCTGTGAGCCATGGGAGCTTCCTGATGTGGCAAAGTTAAACGCAAA 
GGAATGGTACTTCTTCAGTTTCCGTGATCGAAAGTATGCTACTGGATATCGCACGAACAG 
AGCGACAGTAAGCGGATACTGGAAAGCAACAGGAAAAGATCGAACGGTGATGGATCCACG 
TACAAGGCAATTGGTAGGGATGAGAAAAA(^CTAGTGTTCTACAGAAACAGAGCACCAAA 
TGGGATCAAAACTACTTGGATCATGCACGAGTTCCGTCTTGAGTGTCCTAACATCCCACA 
TAAGGAAGACTGGGTCTTGTGCAGAGTGTTCAACAAAGGCAGAGACTCATCGCTACAAGA 
CAATAATTATTATAACAATGATAATCAGACGCAAAGGCTTGAAGTTAATGACGCTCCGGA 
TCTTAATTACAACAATCAGTTGCCACCTTTGCTATCATCCCCTCCTCATAATCATCAACA 
TGAGAAGATGAAAATCCAAGTTTGTGATCAGTGGGAGCAGCTAATGAAGCAGCCTTCAAG

GACCACCGGCCACCCCTATCATCACCATTGTCATCATCAAACCATAGCATGTGGTTGGGA 

GCAGATGATGATCGGTTCGCTGTCATCACCTTCGAGTCATGGCCCTGATCACGAGTCCTT 

TGCTAAATTTGCTTTACCGTCGACAATAACAACAGTGTCAACATCAGTGGTGATCATCAT 

(^GAATTATGAGAAGATTTTGTTGTCATCACTAGACATGACGAGTTTGGATCACGACAAG 

ACATGTATGGGATCATCATCGGATGGTGGTATGGTCTCTGATCTTCACATGGAATGTGGT 

GGATTGAGTTTTGAGACCGAAAATATCCTCGCTTTCCAATGAACATAATTCAAGGGGTTC 

GCCAATTTGTTGATTCGTGAATTATACAAACATTTTATCTATAGATTTATCACATTATCA 

AACATGTAAGTTGTGTGGCATTTGGGTATAGGGTTTGTGTGATTCTAGGTTTTTAGGACG 

ATGTATGTTGTTATATTTAGCGTGTTTTTAGGATTTATTCTCATTTTAAAATTATATGAA 

AACCCATTACT^ATGAATACAATTAGTTTTCTTTGTTGTAAATAATATTTTA 

AAAAAAAAAAAAAA 

>G509 Amino Acid Sequence (domain in AA coordinates: 13-169) 
MGLKDIGSKLPPGFRFHPSDEELVCHYLCNKIRAKSDHGDVDDDDDDVDEALKGSTDLVE
IDLHICEPWELPDVAKLNAKEWYFFSFRDRKYATGYRTNRATVSGYWKATGKDRTVMDPR
TRQLVGMRKTLVFYRNRAPNGIKTTWIMHEFRLECPNIPHKEDWVLCRVFNKGRDSSLQD

NNYYNNDNQTQRLEVNDAPDLNYNNQLPPLLSSPPHNHQHEKMKIQVCDQWEQLMKQPSR

TTGHPYHHHCHHQTIACGWEQMMIGSLSSPSSHGPDHESFAKFALPSTITTVSTSVVIII

RIMRRFCCHH*

>G519 (85..894)

CACAAAGATCCTCCGATTCGAAGGTTTATAAAAACTCAAAATCGAATCTTATCCACAAGA 

AAACAACAAGGTACTTTTCCAAAAATGAAGGCGGAGTTGAATTTGCCGGCGGGATTCCGA 

TTTGATCCGACGGACGAAGAGCTTGTCAAGTTCTATCTTTGCCGGAGATGTGCGTCAGAA 

CCGATTAACGTTCCGGTTATCGCAGAGATTGACTTGTACAAATTCAATCCATGGGAGCTT 

CCAGAAATGGCGTTGTACGGTGAGAAAGAATGGTACTTCTTCTCGCATAGAGACCGGAAA 

TACCCAAACGGGTCGAGACCAAACCGGGCAGCTGGAACCGGTTATTGGAAAGCGACTGGA 

GCTGATAAACCGATCGGAAAACCGAAGACGTTAGGGATTAAGAAAGCACTCGTCTTCTAC 

GCAGGAAAAGCTCCGAAAGGGATTAAAACGAATTGGATTATGCACGAGTATCGTCTCGCT 

AATGTCGATCGATCTGCTTCTACCAACAAGAAGAACAACTTAAGACTTGATGATTGGGTT 

TTGTGTCGGATATACAATAAGAAAGGAACAATGGAGAAGTATTTACCGGCGGCGGCTGAG 

AAACCGACGGAAAAGATGAGTACGTCGGACTCAAGATGCTCAAGTCACGTGATTTCACCG 

GACGTCACGTGTTCTGATAACTGGGAGGTTGAGAGTGAGCCCAAATGGATTAATCTGGAA 

GACGCGTTAGAGGCATTTAATGATGACACGTCCATGTTTAGTTCCATTGGTTTGTTGCAA

AATGACGCCTTTGTTCCTCAGTTTCAGTACCAGTCCTCCGATTTCGTCGATTCGTTTCAG 

GACCCGTTCGAGCAGAAACCGTTCTTGAATTGGAATTTTGCTCCTCAAGGGTAAAAATAA 

TCGGCAAAAAGTTGAAGCTTTTCAGAGTCTTCGATCACCGGCATTGTGTCGGATCCTGAC 

CCGGAGACCAAGTCGGGTCATACGATTACATAATCGGGTTATTGAGATTTCCACATTTGG 

ATTTCCGAGACTAACCAACTTAACGGATTCTGGGGTAATTGGGGGGTTTTGCACAGGTGA 

ATCACACTGAGTCAGCAAGTTTCGATTTTTTGGTTTTGTTTT 

TCTAAAGATATCACGAAGTAGATTCAGAAGAACTGTAAAAGCAATTGTGACCACCCGTTA 
TGAATC7VTAAATATATTCAATGAAGCATC 

>G519 Amino Acid Sequence (conserved domain in AA coordinates: 11-104) 
MKAELNLPAGFRFHPTDEELVKFYLCRRCASEPINVPVIAEIDLYKFNPWELPEMALYGE
KEWYFFSHRDRKYPNGSRPNRAAGTGYWKATGADKPIGKPKTLGIKKALVFYAGKAPKGI
KTNWIMHEYRLANVDRSASTNKKNNLRLDDWVLCRIYNKKGTMEKYLPAAAEKPTEKMST

SDSRCSSHVISPDVTCSDNWEVESEPKWINLEDALEAFNDDTSMFSSIGLLQNDAFVPQF 

QYQSSDFVDSFQDPFEQKPFLNWNFAPQG* 

>G561 (86..1168)

AATTTGTTTTTTTTTCTTTTGTGGGTTCAATTCGAATTGTTTTCCCTGAGACTCAAGTTA 
CTGTGTCATTACTCTGCATTGAGCAATGGGTAGCAACGAAGAAGGAAACCCCACTAACAA 
CTCTGATAAGCCATCGCAAGCTGCTGCTCCTGAGCAGAGTAATGTTCATGTGTATCATCA 
TGACTGGGCTGCTATGCAGGCATATTATGGGCCTAGAGTTGGTATACCTCAATATTACAA 
CTCAAATTTGGCGCCTGGTCATGCTCCACCGCCTTATATGTGGGCGTCTCCATCGCCAAT 
GATGGCTCCTTATGGAGCACCATATCCACCATTTTGCCCTCCTGGTGGAGTTTATGCTCA 
TCCTGGTGTTCAAATGGGCTCACAACCACAAGGTCCTGTTTCTCAATCAGCATCTGGAGT 
TACAACCCCTTTGACCATTGATGCACCAGCTAATTCAGCTGGA7^ACTCAGATCATGGGTT 
CATGAAAAAGCTGAAAGAGTTCGATGGACTTGCAATGTCAATAAGCAATAACAAAGTTGG 
GAGTGCTGAACATAGCAGCAGTGAACATAGGAGTTCTCAGAGCTCCGAGAATGATGGCTC
TAGCAATGGTAGTGATGGTAATACAACTGGGGGAGAACAATCTAGGAGGAAAAGAAGGCA 
ACAAAGATCACCAAGCACTGGTGAAAGACCCTCATCTCAAAACAGTCTGCCTCTTAGAGG 
TGAAAATGAGAAACCCGATGTGACTATGGGGACTCCTGTTATGCCCACAGCAATGAGTTT 
CCAAAACTCTGCTGGCATGAACGGTGTGCCACAGCCATGGAATGAAAAAGAGGTTAAACG 
AGAGAAGAGAAAACAGTCAAACCGAGAATCTGCTAGGAGGTCAAGACTGAGGAAGCAGGC 
TGAAACAGAACAACTATCTGTCAAAGTTGACGCATTAGTAGCTGAGAACATGTCTCTGAG 
GTCTAAACTAGGCCAGCTAAACAATGAGTCTGAGAAACTACGGCTGGAGAACGAAGCTAT 
ATTGGATCAACTGAAAGCGC^GCAACAGGGAAAACAGAGAACCTGATCTCTCGAGTTGA 
TAAGAACAACTCTGTATCAGGTAGCAAAACTGTGCAGCATCAACTGTTAAATGCAAGTCC 
GATAACCGATCCTGTCGCGGCTAGCTGACCGTGGCCGCAACAATGAGAACCCGATATTTC 
TTCCTTTGGGTTGTGATTGTAACTTAAAAGGAGACTTTTTGTTTTTATTCTTAGATTTGT 
AGCTCTCTGCATAGTGAGCATAAATTGATGTAATATGGTTTAAGAGATTCGGTGTTCTCT 
GGTGTGTGCTGCAACCACATAATTGGTGATAGATAGGTTTAGTTATATAAGCAAATGTAT 
TAGAGATAAGGGGAGACATATTTGATGGTCTTT 

>G561 Amino Acid Sequence (domain in AA coordinates: 248-308) 
MGSNEEGNPTNNSDKPSQAAAPEQSNVHVYHHDWAAMQAYYGPRVGIPQYYNSNLAPGHA
PPPYMWASPSPMMAPYGAPYPPFCPPGGVYAHPGVQMGSQPQGPVSQSASGVTTPLTIDA
PANSAGNSDHGFMKKLKEFDGLAMSISNNKVGSAEHSSSEHRSSQSSENDGSSNGSDGNT

TGGEQSRRKRRQQRSPSTGERPSSQNSLPLRGENEKPDVTMGTPVMPTAMSFQNSAGMNG
VPQPWNEKEVKREKRKQSNRESARRSRLRKQAETEQLSVKVDALVAENMSLRSKLGQLNN
ESEKLRLENEAILDQLKAQATGKTENLISRVDKNNSVSGSKTVQHQLLNASPITDPVAAS
* 

>G590 (102..1223)

TCGACAGACACTCTCCCTCTCTCCATGCCCATAAAATCTCAAAGACTGTTTAAAAAAAAA 
AATGTTTTAGCTTTAACTGCTTTTTTTTTGTTGTTGGTGTAATGATATCACAGAGAGAAG 
AAAGAGAAGAGAAGAAGCAGAGAGTGATGGGAGATAAGAAATTGATTTCATCTTCTTCTT 
CTTCCTCGGTTTACGATACTCGTATCAATCATCATCTTCATCATCCTCCGTCTTCTTCCG 
ACGAAATCTCTCAGTTTCTCCGGCATATTTTCGACCGTTCTTCTCCTTTACCTTCTTACT 
ACTCCCCGGCGACGACTACAACGACGGCGTCTTTGATTGGTGTGCACGGGAGCGGTGACC 
CACATGCAGATAACTCGAGAAGTCTCGTTTCTCATCATCCACCGTCAGATTCTGTGCTTA 
TGTCGAAACGTGTCGGAGATTTCTCTGAGGTTTTAATCGGCGGAGGATCAGGCTCAGCCG 
CCGCGTGTTTTGGTTTCTCCGGTGGTGGTAATAATAACAACGTTCAAGGAAATAGCTCTG 
GGACTCGAGTATCGTCTTCTTCCGTTGGAGCTAGTGGCAACGAGACAGATGAGTATGACT 
GTGAAAGCGAGGAAGGAGGAGAAGCTGTAGTTGATGAAGCTCCCTCTTCCAAGTCAGGTC 
CTTCTTCTCGTAGTTCATCTAAAAGATGCAGAGCTGCTGAAGTTCATAATCTCTCTGAGA 
AGAGGAGGAGAAGTAGAATTAATGAAAAAATGAAAGCTTTACAAAGTCTCATCCCTAATT 
CAAATAAGACGGATAAGGCTTCAATGCTTGATGAAGCCATTGAGTATCTGAAACAGCTTC 
AGCTCCAAGTTCAGATGTTGACTATGAGAAATGGAATAAACTTGCATCCTTTGTGTTTAC 
CTGGAACTACATTACACCCATTGCAACTCTCTCAGATTCGACCCCCTGAAGCAACCAATG 
ATCCTCTGCTTAATCATACCAATCAGTTTGCTTCGACTTCTAATGCACCGGAAATGATCA 
ATACTGTGGCTTCTTCATACGCTTTGGAACCTTCTATTCGCAGTCACTTTGGACCTTTCC 
CTCTCCTTACTTCACCCGTGGAGATGAGTCGGGAAGGTGGGTTAACTCATCCAAGGTTGA 
ACATTGGTCATTCCAACGCAAACATAACCGGGGAACAAGCTCTGTTTGATGGACAACCTG 
ACCTAAAAGATCGAATTACTTGAACAGTGTCCCAACTTCGGGATCTCTATGTGTTCTTGT 
TTCTTAGAACGCAAGCCATAAAGCTGTCTGAC 

>G590 Amino Acid Sequence (domain in AA coordinates: 202-254) 

MISQREEREEKKQRVMGDKKLISSSSSSSVYDTRINHHLHHPPSSSDEISQFLRHIFDRS 

SPLPSYYSPATTTTTASLIGVHGSGDPHADNSRSLVSHHPPSDSVLMSKRVGDFSEVLIG 

GGSGSAAACFGFSGGGNNNNVQGNSSGTRVSSSSVGASGNETDEYDCESEEGGEAVVDEA

PSSKSGPSSRSSSKRCRAAEVHNLSEKRRRSRINEKMKALQSLIPNSNKTDKASMLDEAI 

EYLKQLQLQVQMLTMRNGINLHPLCLPGTTLHPLQLSQIRPPEATNDPLLNHTNQFASTS

NAPEMINTVASSYALEPSIRSHFGPFPLLTSPVEMSREGGLTHPRLNIGHSNANITGEQA 

LFDGQPDLKDRIT* 

>G818 (65..1060)

GTATTTCTTACAATAAACGACCAAAAAGTTAATACAAGA7UVTAGAAACGGTGTAGGAAGC 
TACTATGACGGCAATTCCAAACGTCGTCGATATTGAATCTTCTTCCTCTTCGCTTTGTCA 
AGAGACGGCAACGGAGACCGTCACCGTTGAAAGAGGCTCGTCTGATTCATCTTCAAAGCC

AGACGACGTCGTTTTACTAATCAAGGAAGAGGAGGATGACGCGGTTAACTTGTCACTTGG 

TTTTTGGAAATTGCACGAGATAGGTTTAATAACACCGTTCTTGAGAAAGACGTTTC 

CGTCGATGACAAAGTAACAGACCCGGTTGTATCATGGAGCCCGACCCGTAAAAGCTTTAT 

CATTTGGGATTCTTACGAGTTCTCAGAGAATCTACTTCCCAAATACTTCAAGCACAAGAA 

CTTCTCCAGTTTTATTCGTCAGCTTAACTCTTA 

GTGGGAATTTGCTAACGAAGGGTTTCAAGGAGGGAAGAAACATTTGCTTAA 

GAGGAGAAGCAAAAACACTAAATGTTGTAACAAGGAAGCGAGTACCACCACGACAGAGAC 

TGAGGTTGAGTCATTGAAGGAGGAACAGAGTCCAATGAGATTGGAGATGTTGAAGCTGAA 

ACAAC/^CAAGAAGAATCTCAACATCAGATGGTCACTGTGCAGGAGAAGATCCACGGAGT 

TGATACCGAACAACAGCATATGCTTAGTTTCTTTGCAAAGTTGGCTAAAGATCAAAGATT 

TGTAGAGAGACTGGTGAAGAAGAGAAAGATGAAAATACAGAGAGAGCTAGAAGCAGCTGA 

ATTCGTGAAGAAGCTCAAGTTGCTTCAGGATCAAGAAACTCAAAAGAACTTGTTAGATGT 

AGAAAGAGAATTTATGGCCATGGCTGCAACAGAACACAATCCCGAGCCTGACATTTTGGT 

GAACAATCAAAGCGGGAATACGAGATGTCAGCTTAACTCAGAGGACCTACTTGTTGACGG 

TGGCTCAATGGATGTAAATGGGAGGATAGAGATAGAGTAGAGCAAAACCGGTAACATAGC 

AATAGAGAAGGTACCAAATCCCAAGGCTTGAGATCCGAAT 

>G818 Amino Acid Sequence (domain in AA coordinates: 70-162) 

MTAIPNVVDIESSSSSLCQETATETVTVERGSSDSSSKPDDVVLLIKEEEDDAVNLSLGF 

WKLHEIGLITPFLRKTFEIVDDKVTDPVVSWSPTRKSFIIWDSYEFSENLLPKYFKHKNF

SSFIRQLNSYGFKKTOSDRWEFANEGFQGGKKHLLKNIKRRSK^ 

VESLKEEQSPMRLEMLKLKQQQEESQHQMVTV^ 

ERLVKKRKMKIQRELEAAEFVKKLKLLQDQETQKNLLDVEREFMAMAATEHNPEPDILVN

NQSGNTRCQLNSEDLLVDGGSMDVNGRIEIE* 

>G849 (218..2077)

AACTCGAGAATTCTTCATTTCTTTTAAATCTTAGAATCTCGAGTTTTTGTATAAATCGAT 
TCTAATTTTTCCTTTGTACATTGTTTTATATATACATAAAACACACAAATCGGGTATGGG 
GGAATTTGGGTTTTAAGATAGCGTGATCTGTAATAATAAGTGGTTCGCGATCGTGATCAA 
GAAACTGGTGGCTGATAGTGATATGCATATTTGAGAGATGGTGTTCAAGAGAAAGTTAGA 
TTGCCTTTCCGTGGGATTTGATTTTCCCAACATTCCCAGAGCTCCTCGTTCATGCAGGAG 
GAAGGTTCTAAACAAGAGGATTGATCATGATGATGATAACACTCAGATCTGTGCAATTGA 
CTTACTAGCTTTGGCTGGAAAGATTCTACAGGAAAGCGAGAGTTCCTCTGCGTCTTCTAA 
TGCATTTGAAGAAATTAAGCAAGAGAAAGTAGAAAATTGCAAGACTATTAAATCTGAGTC 
TTCTGACCAAGGAAACTCTGTGTCAAAGCCTACTTATGATATCTCTACTGAGAAGTGTGT 
GGTGAACAGTTGTTTTTCATTTCCGGATAGTGACGGCGTTTTGGAGCGGACTCCGATGTC 
TGATTACAAGAAGATTCATGGTTTGATGGATGTAGGGTGTGAAAACAAGAATGTAAATAA 
TGGGTTCGAGCAAGGAGAAGCAACCGATCGCGTGGGTGATGGAGGCTTAGTCACTGATAC 
TTGCAACTTAGAGGATGCAACTGCGTTAGGTCTGCAGTTTCCGAAATCAGTCTGTGTGGG 
TGGTGATTTAAAATCACCATCCACCTTGGATATGACCCCTAATGGTTCCTATGCTAGACA 
TGGGAACCATACTAACCTAGGTAGAAAAGATGATGATGAAAAATTCTATAGTTACCATAA 
ACTTAGCAATAAATTTAAGTCGTATAGGTCTCCAACAATTCGAAGAATAAGAAAGTCCAT 
GTCGTCCAAATACTGGAAACAAGTTCCAAAAGATTTTGGATACAGTAGAGCTGATGTGGG 
TGTGAAGACTCTTTATCGCAAAAGAAAATCATGTTATGGTTACAACGCATGGCAGCGTGA 
GATCATTTATAAGAGAAGAAGATCACCTGACAGAAGCTCGGTCGTAACTTCTGATGGAGG 
ACTCAGTAGTGGAAGTGTTTCCAAGTTACCCAAGAAGGGAGATACAGTAAAGCTAAGCAT 
TAAGTCCTTTAGGATTCCAGAGCTTTTTATTGAAGTTCCAGAAACTGCAACAGTAGGATC 
ACTAAAGAGGACTGTGATGGAGGCTGTCAGTGTTTTACTCAGCGGAGGAATACGTGTTGG 
GGTGTTAATGCATGGGAAGAAGGTTAGAGATGAAAGGAAAACTCTGTCCCAGACTGGGAT
CTCATGTGATGAAAATCTAGACAACCTTGGGTTCACCTTGGAGCCTAGTCCCAGCAAAGT 
TCCCCTACCTTTGTGTTCTGAAGATCCTGCTGTGCCAACCGACCCTACAAGTTTGTCTGA 
ACGGTCTGCGGCGTCTCCTATGCTAGATTCTGGAATTCCACATGCAGATGACGTGATTGA 
TTCAAGAAATATTGTGGACAGTAACCTCGAATTAGTTCCATATCAGGGTGACATATCTGT 
TGATGAACCTTCATCAGATTCAAAAGAGCTTGTCCCACTTCCAGAGTTGGAAGTCAAGGC 
GCTTGCCATAGTTCCGTTGAACCAGAAACCTAAGCGTACTGAGCTAGCCCAGAGGAGAAC 
TAGGAGACCCTTCTCTGTGACAGAGGTAGAAGCTCTTGTACAAGCAGTTGAGGAACTCGG 
GACTGGAAGATGGCGTGATGTAAAATTGCGTGCTTTCGAGGATGCAGATCATCGGACTTA 
CGTGGACTTGAAGGACAAATGGAAGACGCTAGTTCACACAGCAAGTATATCCCCACAGCA 
ACGAAGAGGAGAGCCGGTGCCACAAGAACTGCTAGACAGAGTCTTGAGGGCATACGGGTA
TTGGTCGCAGCACCAAGGAAAACATCAGGCGAGAGGAGCGTCCAAAGATCCAGACATGAA 
CAGAGGTGGAGCTTTTGAATCAGGTGTTTCAGTGTAAAAAAGGAGGTACGCATTGGTGGG 
TGGGTGTACAGAAGCAAACAACACAATAAATGGACAACTCAATTTCTGCAAAGTTTAATT 
GTCTTTATTTCTCGTTTTTTTTTTTTTTTCTCCTACATACACTTT^ 

>G849 Amino Acid Sequence (domain in AA coordinates: 324-413, 504-583) 

MVFKRKLDCLSVGFDFPNIPRAPRSCRRKVLNKRIDHDDDNTQICAIDLLALAGKILQES 

ESSSASSNAFEEIKQEKVENCKTIKSESSDQGNSVSKPTYDISTEKCVVNSCFSFPDSDG

VLERTPMSDYKKIHGLMDVGCENKNVNNGFEQGEATDRVGDGGLVTDTCNLEDATALGLQ

FPKSVCVGGDLKSPSTLDMTPNGSYARHGNHTNLGRKDDDEKFYSYHKLSNKFKSYRSPT 

IRRIRKSMSSKYWKQVPKDFGYSRADVGVKTLYRKRKSCYGYNAWQREIIYKRRRSPDRS

SVVTSDGGLSSGSVSKLPKKGDTVKLSIKSFRIPELFIEVPETATVGSLKRTVMEAVSVL 

LSGGIRVGVLMHGKKVRDERKTLSQTGISCDENLDNLGFTLEPSPSKVPLPLCSEDPAVP 

TDPTSLSERSAASPMLDSGIPHADDVIDSRNIVDSNLELVPYQGDISVDEPSSDSKELVP 

LPELEVKALAIVPLNQKPKRTELAQRRTRRPFSVTEVEALVQAVEELGTGRWRDVKLRAF

EDADHRTYVDLKDKWKTLVHTASISPQQRRGEPVPQELLDRVLRAYGYWSQHQGKHQARG

ASKDPDMNRGGAFESGVSV*

>G892 (21..1004)

TATAACAATTCCTTCCAACAATGTCATTGAGTCAGCCAATAACACGGACCGATAGTGCAC 
CCAATGGAGCATTTAGGACTTTTGGTCTCTACTGGTGCTACCATTGTGATCGTATGGTCA 
GAATTGCATCCTCTAACCCATCAGAGATCGCCTGTCCTCGATGTTTGAGGCAATTTGTCG 
TTGAGATTGAAACGAGACAACGGCCTCGGTTTACTTTCAACCATGCTACTCCGCCTTTTG 
ATGCTTCTCCTGAGGCTCGTCTTCTCGAAGCTCTCTCGCTCATGTTTGAGCCTGCAACCA 
TAGGTAGGTTTGGTGCAGACCCATTTCTTAGGGCAAGATCCAGAAACATCTTGGAACCTG 
AATCAAGACCCCGACCGCAACATCGAAGACGACACAGCCTTGACAATGTTAACAATGGTG 
GTTTACCTCTACCAAGAAGAACATATGTTATTCTCCGGCCCAATAATCCGACTAGTCCAC 
TCGGAAACATAATTGCGCCACCAAATCAAGCACG^^ 

ACTTTACTGGAGCATCAAGCTTAGAGCAGCTGATTGAACAACTAACACAAGACGATAGGC 

CTGGACCACCACCTGCGTCAGAACCCACCATTAATTCCCTACCATCTGTGAAAATAACAC 

CACAACATCTAACTAACGACATGTCCCAATGCACAGTGTGCATGGAAGAATTCATTGTTG 

GTGGGGACGCAACGGAATTACCATGTAAACATATTTACCATAAAGATTGTATAGTCCCGT 

GGCTTAGGCTTAACAATTCTTGCCCTATCTGCCGCCGTGACCTGCCACTTGTCAACACCG 

TTGCTGAATCTCGAGAAAGGAGCAATCCTATTAGACAAGACATGCCTGAAAGAAGGCGTC 

CAAGGTGGATGCAACTCGGTAACATTTGGCCATTTAGAGCAAGATACCAAAGGGTTAGTC 

CAGAAGAAACAGCAAACCAGAATCCTCGAGATAACAGGAGCTAACTCTGAATATTCCATG 

GGAAATAAAAATCGTGACTATCTATATGTATAGACTCTATGAGACATTGTCTATTTGAAT 

GTGCATGTATATCTCT^GAAATAAACTCAAGCGAAAGATATTTAACGACTAAAAAAAA 

>G892 Amino Acid Sequence (domain in AA coordinates: 177-270) 

MSLSQPITRTDSAPNGAFRTFGLYWCYHCDRMVRIASSNPSEIACPRCLRQFVVEIETRQ

RPRFTFNHATPPFDASPEARLLEALSLMFEPATIGRFGADPFLRARSRNILEPESRPRPQ

HRRRHSLDNVNNGGLPLPRRTYVILRPNNPTSPLGNIIAPPNQAPPRHVNSHDYFTGASS

LEQLIEQLTQDDRPGPPPASEPTINSLPSVKITPQHLTNDMSQCTVCMEEFIVGGDATEL 

PCKHIYHKDCIVPWLRLNNSCPICRRDLPLVNTVAESRERSNPIRQDMPERRRPRWMQLG 

NIWPFRARYQRVSPEETANQNPRDNRS* 

>G961 (1..1200) 

ATGTCAAAATCTATGAGCATATCAGTGAACGGACAATCTCAAGTGCCTCCTGGGTTTAGG 
TTTCATCCGACCGAGGAAGAGCTGTTGCAGTATTATCTCCGGAAGAAAGTTAATAGCATC 
GAGATCGATCTTGATGTCATTCGCGACGTTGATCTCAACAAGCTCGAGCCTTGGGACATT 
CAAGAGATGTGTAAAATAGGAACAACGCCACAAAACGACTGGTATTTCTTTAGCCACAAG 
GACAAAAAATATCCGACGGGAACGAGAACTAACAGAGCCACTGCGGCTGGATTTTGGAAA 
GCAACTGGCCGCGACAAGATCATATATAGCAATGGCCGTAGAATTGGGATGAGAAAGACT 
CTTGTTTTCTACAAAGGCCGAGCTCCTCACGGCCAAAAATCTGATTGGATCATGCATGAA 
TATAGACTCGATGACAACATTATTTCCCCCGAGGATGTCACCGTTCATGAGGTCGTGAGT 
ATTATAGGGGAAGCATCACAAGACGAAGGATGGGTGGTGTGTCGTATTTTCAAGAAGAAG 
AATCTTCACAAAACCCTAAACAGTCCCGTCGGAGGAGCTTCCCTGAGCGGCGGCGGAGAT 
ACGCCGAAGACGACATCATCTCAGATCTTCAACGAGGATACTCTQGACCAATTTCTTGAA 
CTTATGGGGAGATCTTGTAAAGAAGAGCTAAATCTTGACCCTTTCATGAAACTCCCAAAC 
CTCGAAAGCCCTAACAGTCAGGCAATCAACAACTGCCACGTAAGCTCTCCCGACACTAAT
CATAATATCCACGTCAGCAACGTGGTCGACACTAGCTTTGTTACTAGCTGGGCGGCTTTA 
GACCGCCTCGTGGCCTCGCAGCTTAACGGACCCACATCATATTCAATTACAGCCGTCAAT 
GAGAGCCACGTGGGCCATGATCATCTCGCTTTGCCTTCCGTCCGATCTCCGTACCCCAGC 
CTAAACCGGTCCGCTTCGTACCACGCCGGTTTAACACAGGAATATACACCGGAGATGGAG 
CTATGGAATACGACGACGTCGTCTCTATCGTCATCGCCTGGCCCATTTTGTCACGTGTCG 
AATGTTTTGCTGCTTGTTTGTCTCCTTCGTCTGCAGCTTCAGTTCTGGCCGTTCCAACCA 
TGGCAGAGGCAGGTTCATTTCGATCTTTCATCGCCTCAGATGCAGATCTCTCTCCATTGA 

>G961 Amino Acid Sequence (conserved domain in AA coordinates: 15-140)

MSKSMSISVNGQSQVPPGFRFHPTEEELLQYYLRKKVNSIEIDLDVIRDVDLNKLEPWDI

QEMCKIGTTPQNDWYFFSHKDKKYPTGTRTNRATAAGFWKATGRDKIIYSNGRRIGMRKT

LVFYKGRAPHGQKSDWIMHEYRLDDNIISPEDVTVHEVVSIIGEASQDEGWVVCRIFKKK

NLHKTLNSPVGGASLSGGGDTPKTTSSQIFNEDTLDQFLELMGRSCKEELNLDPFMKLPN

LESPNSQAINNCHVSSPDTNHNIHVSNVVDTSFVTSWAALDRLVASQLNGPTSYSITAVN

ESHVGHDHLALPSVRSPYPSLNRSASYHAGLTQEYTPEMELWNTTTSSLSSSPGPFCHVS

NVLLLVCLLRLQLQFWPFQPWQRQVHFDLSSPQMQISLH* 

>G1465 (163..1125)

TATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTGGAGAGGACACGC 
TGACAAGCTGACTCTAGCTTATCTGGTACCGTCGACCTCATTCTTGCGTTTGATCTTTCT 
TTCTCTAGATCCCATATTTTTCTTGATCAATTTAGTTTC^TTATGGAGGAAGATGCAGCT 
TTTGATCTACTCAAAGCCGAACTCTTAAACGCAGAAGACGATGCAATAATCTCACGTTAT 
CTGAAGCGTATGGTCGTCAACGGAGACTCATGGCCTGATCACTTCATCGAAGACGCAGAC 
GTGTTCAACAAGAATCCAAATGTGGAGTTCGATGCTGAGAGCCCTAGCTTCGTGATAGTT
AAACCTCGAACAGAGGCTTGTGGTAAAACCGATGGATGTGAAACTGGTTGCTGGAGGATC 
ATGGGTCGTGATAAACCGATAAAATCGACGGAGACTGTGAAGATTCAAGGGTTCAAGAAG 
ATTCTCAAGTTCTGCCTAAAGAGGAAACCTAGAGGATACAAGAGAAGTTGGGTAATGGAA 
GAGTATAGGCTTACCAATAACTTGAACTGGAAGCAAGATC^TGTGATTTGCAAGATTCGG 
TTTATGTTTGAAGCTGAAATCAGTTTCTTGCTAGCCAAGCATTTCTACACTACATCAGAA 
TCACTTCCTCGAAATGAGCTGTTGCCAGCTTACGGATTCCTTTCATCAGATAAGCAATTG 
GAGGATGTATCTTATCCGGTGACGATAATGACTTCTGAAGGAAACGATTGGCCTAGCTAC 
GTTACCAACAATGTGTATTGTCTGCATCCATTGGAGCTCGTTGATCTTCAAGATCGGATG 
TTTAATGATTACGGAACCTGCATCTTCGCTAACAAGACTTGTGGTAAAACCGATAGATGC 
ATTAATGGTGGTTACTGGAAAATTTTGCACCGTGATAGGCTGATCAAGTCAAAGTCCGGG 
ATAGTTATTGGTTTC^^GAAGGTGTTTAAGTTTC^TGAAACGGAGAAAGAAAGATACTTC 
TGTGGTGGAGAAGATGTGAAGGTAACTTGGACTCTAGAAGAGTATAGGCTTAGCGTGAAG 
CAGAATAAATTCTTGTGCGTTATCAAGTTTACTTATGATAACTAAGAATCTTTTCTTTGG 
ATTTTATGATCATCTTAGTATCGCGACCGCTCTAGACAGGCCTCGTACCGGATCCTCTAG 
CTAGAGCTTTCGTTCGTATCATCGGTTTCGACAACG 

>G1465 Amino Acid Sequence (conserved domain in AA coordinates: 242- 
MEEDAAFDLLKAELLNAEDDAIISRYLKRMVVNGDSWPDHFIEDADVFNKNPNVEFDAES
PSFVIVKPRTEACGKTDGCETGCWRIMGRDKPIKSTETVKIQGFKKILKFCLKRKPRGYK

RSWVMEEYRLTNNLNWKQDHVICKIRFMFEAEISFLLAKHFYTTSESLPRNELLPAYGFL

SSDKQLEDVSYPVTIMTSEGNDWPSYVTNNVYCLHPLELVDLQDRMFNDYGTCIFANKTC

GKTDRCINGGYWKILHRDRLIKSKSGIVIGFK^ 

YRLSVKQNKFLCVIKFTYDN* 

>G425 (45..1196)

GAAAAC^GTCTTCTCTTCTCCGATCC(^AAAACGCAGGAAAAC^TGTCGTTTAAC^GCTCCC 

ACCTCCTTCCTCCACAAGAAGACCTTCCTCTCCGACACTTCACCGATCAATCACAGCAACCTC 

CGCCGCAGCGTCACTTCTCTGAAACACCTTCGCTTGTCACCGCCAGTTTCCTCAACCTCCCTA 

CCACCCTTACCACTGCGGATTCCGATCTCGCTCCTCCGCACCGCAACGGAGACAATTCCGTT 

GCTGATACAAACCCACGCTGGCTCTCCTTTCATTCGGAGATGCAAAATACTGGAGAAGTACG 

TTCTGAAGTTATCGACGGAGTCAACGCCGATGGTGAAACGATACTCGGCGTTGTAGGAGGT 

GAAGATTGGCGGAGTGCTAGCTATAAGGCGGCGATTTTAAGACATCCGATGTACGAGCAGC 

TTCTTGCGGCTCACGTGGCTTGCCTTAGGGTTGCGACTCCCGTTGACCAGATTCCGAGGATC 

GATGCTCAGCTCAGTCAGTTGCATACCGTCGCCGCGAAATACTCCACTCTTGGTGTGGTTGTT 

GACAACAAGGAACTTGATCATTTCATGTCACATTATGTTGTCTTGTTATGTTCATTTAAAGAACA 

ACTCCAACACCACGTTTGTGTCCATGCAATGGAAGCCATTACGGCTTGTTGGGAGATTGAACA 
ATCACTGCAATCCCTAACTGGAGTTTCTCCAAGTGAAAGTAATGGTAAGACAATGTCGGATGA

TGAAGATGATAATC7\AGTAGAGAGCGAGGTGAAGATGTTTGATGGAAGTTTGGACGGTTCAG 

ATTGCTTGATGGGGTTTGGTCCTCTTGTTCCAACCGAGAGAGAGAGATCTTTGATGGAACGTG 

TGAAGAAAGAACTGAAGCATGAGCTTAAACAGGGTTTCAAAGAGAAGATTGTGGACATAAG 

AGAAGAGATAATGAGGAAGAGAAGGGCGGGAAAGTTGCCAGGAGATACGACTTCTGTACT 

CAAAGAATGGTGGCGAACTCACTCGAAATGGCCATACCCAACTGAGGAAGATAAGGCAAAA 

GAAACTGGAACAGCAACTCTTCCACGTCATCTACTCTCACCAAGAACAAACGTAAACGGACC 

GGGAAGTCGTAGGTGACATAGCGGCTAACTAGAGGATGGTTCTTTGCCATGTGAATTCTTGG 

GAACCGTATATGAAAGAAACGAATCCGGTTCTATGCTCGTACAGAGTGTGTTATTTGTATAGT 

GGATACCGGTTAGCCTATGAAACCGGATTCTGGAGTCCAAATTGTTGTTTGTAACGACTTAGT 

AGTTTTTGGAAGTGATCTGTTTCGTTGGTTTGCGTCTTC 

TTTTTTCTTGTAAAGTGTCAATATGTTCGT 

AAACTAGCTTGAAATGTAAAAAAAAAAAAAAAA 

>G425 Amino Acid Sequence (domain in AA coordinates: TBD)

MSFNSSHLLPPQEDLPLRHFTDQSQQPPPQRHFSETPSLVTASFLNLPTTLTTADSDLAPPHR

NGDNSVADTNPRWLSFHSEMQNTGEVRSEVIDGVNADGETILGVVGGEDWRSASYKAAILR

HPMYEQLLAAHVACLRVATPVDQIPRIDAQLSQLHTVAAKYSTLGVVVDNKELDHFMSHYVVL

LCSFKEQLQHHVCVHAMEAITACWEIEQSLQSLTGVSPSESNGKTMSDDEDDNQVESEVNM 

FDGSLDGSDCLMGFGPLVPTERERSLMERVKKELKHELKQGFKEKIVDIREEIMRKRRAGKLP 

GDTTSVLKEWWRTHSKWPYPTEEDKAKLVQETGLQLKQINNWFINQRKRNWNSNSSTSSTLT 

KNKRKRTGKS* 

>G347 (1..570) 

atgaaagtagcagatatgcaggaccagctggtgtgtcatggttgtaggaatttattgatg 
tatcctagaggagcatctaatgtgcgttgtgcgttatgtaacactatcaacatggttcct 
cctcctcctccacctcacgacatggcacacattatatgtggtggttgtagaacaatgctt 
atgtatacgcgtggggctagtagcgtaagatgctcttgctgtcaaactacgaaccttgtg 
ccagcgcactccaatcaggttgcccatgctccttccagtcaggttgcgcagatcaattgt 
gggcattgtcggacgaccctcatgtatccttacggtgcatcatccgtcaaatgcgctgtt 
tgtcaattogtaactaacgttaatatgagcaatggaagggtacctctcccaactaaccgg 
ccaaatggaacagcttgtcccccctctacatcaacttcaacaccaccctctcagacccaa 
accgttgttgtagaaaaccccatgtccgttgatgaaagcggaaagttggtgagcaatgtt 
gttgttggagtgacaactgacaaaaagtaa 

>G347 Amino Acid Sequence (domain in AA coordinates: 9-39, 50-70, 80-127) 
MKVADMQDQLVCHGCRNLMYPRGASNWCAL 

MYTRGASSVRCSCCQTTNLVPAHSNQVAHAPSSQVAQINCGHCRTTLMYPYGASSVKCAV
CQFVTNVNMSNGRVPLPTNRPNGTACPPSTSTSTPPSQTQTVVVENPMSVDESGKLVSW 
WGVTTDKK* 
>G1512 (1..732) 

ATGGAAGGGAACTTCTTCATCAGGTCTGATGCTCAACGAGCACATGACT^ATGGCTTCATA 
GCCA7UVCAAAAACCTAATCTCACCACGGCTCCAACAGCAGGTCAAGCTAATGAAAGTGGC 
TGTTTTGACTGCAACATCTGTTTAGACACAGCCCATGATCCGGTGGTCACTCTCTGCGGG 
CACCTTTTCTGCTGGCCTTGCATTTACAAGTGGTTACATGTTCAGTTATCTTCTGTCTCC 
GTTGATCAGGACCAGAACAATTGCCCTGTTTGTAAATCCAACATTACTATCACCTCTTTG 
GTTCCTCTCTATGGAAGAGGCATGTCTTCGCCTTCTTCCACGTTTGGCTCCAAGAAACAA 
GACGCACTGTCCACTGACATACCCCGCAGACCTGCTCCATCAGCCTTACGCAATCCGATT 
ACGTCAGCATCATCTCTGAACCCAAGCTTGCAACATCAAACTCTGTCTCCTTCATTTCAT 
AATCATCAGTATTCCCCTCGTGGCTTCACCACAACCGAATCAACCGACCTTGCCAATGCT 
GTAATGATGAGTTTeCTCTACCCTGTGATTGGAATGTTTGGAGACCTGGTCTACACCAGG 
ATATTCGGGACCTTCACAAACACAATAGCTCAGCCTTACCAAAGCCAGAGGATGATGCAG 
CGTGAGAAGTCTCTTAATCGGGTATCGATATTCTTCCTTTGTTGCATCATCCTTTGCCTC 
CTTCTCTTCTAG 

>G1512 Amino Acid Sequence (domain in AA coordinates: 39-93) 
MEGNFFIRSDAQRAHDNGFIAKQKPNLTTAPTAGQANESGCFDCNICLDTAHDPVVTLCG
HLFCWPCIYKWLHVQLSSVSVDQHQNNCPVCKSNITITSLVPLYGRGMSSPSSTFGSKKQ 
DALSTDIPRRPAPSALRNPITSASSLNPSLQHQTLSPSFHNHQYSPRGFTTTESTDLANA 
VMMSFLYPVIGMFGDLVYTRIFGTFTNTIAQPYQSQRMMQREKSLNRVSIFFLCCIILCL 
LLF* 
>G2069 (1..1026)

ATGGAAGGAGGAGGAAGAGGACCAAATCAAACGATTCTCAGTGAAATAGAACATATGCCT 
GAAGCTCCACGTCAACGTATCTCTCATCACCGTCGAGCTCGCTCTGAAACCTTCTTCTCC 
GGCGAATCAATCGACGATCTCCTCTTATTCGATCCTTCCGATATCGATTTCTCTTCTCTA 
GACTTCCTCAACGCTCCACCACCACCACAACAATCACAACAACAACCGCAAGCTTCTCCC 
ATGTCCGTTGATTCGGAAGAAACCTCATCGAACGGTGTTGTTCCTCCTAATTCTCTTCCT 
CCAAAACCCGAAGCTAGATTCGGTCGCCATGTTCGTAGCTTCTCGGTTGATTCCGATTTC 
TTCGATGATTTGGGTGTTACTGAGGAGAAGTTTATAGCTACAAGTTCAGGAGAGAAGAAG 
AAAGGGAATCATCATCATAGCAGGAGTAATTCTATGGATGGAGAGATGAGTTCGGCGTCG 
TTTAATATCGAATCGATTTTAGCTTCTGTGAGTGGTAAAGATAGTGGGAAGAAGAATATG 
GGTATGGGTGGTGATAGACTTGCTGAGCTTGCTTTGCTTGATCCTAAAAGAGCTAAAAGG 
ATTTTAGCGAATAGACAATCTGCGGCGAGGTCGAAAGAGAGGAAGATTAGGTATACTGGT 
GAGTTAGAGAGGAAGGTTCAGACACTTCAGAATGAAGCTACTACATTGTCTGCTCAAGTC 
ACTATGTTACAGAGAGGAACATCAGAGCTGAACACTGAAAATAAACACCTCAAAATGCGG 
CTTCAAGCTTTAGAGCAACAAGCTGAACTTAGGGATGCTTTGAATGAAGCGCTGCGGGAT 
GAACTGAACCGACTTAAGGTGGTAGCTGGAGAAATTCCTCAGGGGAATGGAAATTCTTAC 
AACCGTGCTCAATTCTCATCTCAGCAATCGGCAATGAATCAGTTTGGGAACAAAACGAAC 
CAACAGATGAGTACAAACGGGCAGCCATCGCTCCCAAGCTACATGGATTTCACCAAGAGA 
GGCTGA 

>G2069 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEGGGRGPNQTILSEIEHMPEAPRQRISHHRRARSETFFSGESIDDLLLFDPSDIDFSSL 

DFLNAPPPPQQSQQQPQASPMSVDSEETSSNGWPPNSLPPKPEARFGRHVRSFSVDSDF 

FDDLGVTEEKFIATSSGEKKKGNHHHSRSNSMDGEMSSASFNIESILASVSGKDSGKKNM 

GMGGDRLAELALLDPKRAKRILAiraQSAARSKERK^ 

TMLQRGTSELNTENKHLKMRLQALEQQAELRDALNEALRDELNRLKVVAGEIPQGNGNSY

NRAQFSSQQSAMNQFGNKTNQQMSTNGQPSLPSYMDFTKRG* 

>G1852 (55..1857)

CATCTGATCTGCTCTCGAAGACGAAAGCTTCGAGTACTGGTTGAAGCTAAAGCTATGGGA 
CACGTGAATCTACCTGCATCAAAGCGTGGTAACCCTCGTCAATGGCGTCTCCTCGACATC 
GTAACCGCTGCTTTCTTCGGTATCGTACTTCTCTTCTTCATCCTTTTATTCACTCCTCTT 
GGTGATTCCATGGCGGCTTCTGGTCGGCAAACGCTGCTTCTCTCTACGGCGTCAGATCCG 
AGGCAACGGCAGCGATTAGTGACTTTGGTTGAAGCTGGTCAGCATTTGCAACCGATCGAG 
TATTGTCCTGCGGAAGCTGTTGCTCATATGCCTTGTGAGGATCCGAGAAGGAATAGTCAG 
CTTAGTAGAGAGATGAATTTCTATAGGGAGAGACATTGTCCTTTGCCTGAGGAGACTCGG 
CTCTGTTTGATTCCTCCGCCTTCTGGTTATAAAATTCCTGTTCCGTGGCCTGAGAGTCTT 
CACAAGATTTGGCATGCAAACATGCCATATAACAAAATTGCTGACCGGAAAGGTCATCAA 
GGATGGATGAAAAGGGAAGGGGAATACTTTACTTTCCCAGGCGGTGGCACGATGTTTCCT 
GGCGGAGCTGGCCAATACATTGAAAAGCTTGCACAGTATATTCCGCTTAATGGTGGAACT 
TTGAGAACTGCTCTTGACATGGGATGCGGGGTAGCTAGTTTTGGAGGTACTCTACTATCT 
CAAGGCATTCTAGCCCTCTCATTTGCTCCAAGAGATTCACATAAATCTCAAATTCAGTTC 
GCTTTGGAAAGAGGAGTGCCTGCATTTGTTGCCATGCTTGGCACTCGTAGACTCCCCTTT 
CCTGCATACTCCTTTGACCTGATGCACTGTTCCCGATGTTTGATTCCTTTTACGGCTTAC 
AATGCAACTTACTTCATCGAAGTAGATAGGTTACTGCGCCCTGGAGGATATCTTGTAATC 
TCTGGCCCACCTGTACAATGGCCTAAACAAGACAAAGAATGGGCTGATCTTCAGGCGGTG 
GCTAGAGCTTTGTGCTATGAGCTAATTGCGGTTGATGGAAACACTGTCATCTGGAAGAAG 
CCTGTTGGAGATTCATGTCTACCTAGCCAGAATGAGTTTGGGCTTGAGTTGTGTGATGAG 
TCTGTTCCGCCAAGTGATGCATGGTATTTTAAATTGAAGAGGTGTGTTACCAGGCCATCA 
TCCGTCAAAGGAGAACACGCTTTGGGAACTATATCCAAGTGGCCGGAGAGGCTTACTAAA 
GTTCCTTCTAGGGCCATTGTCATGAAAAACGGATTGGATGTGTTTGAAGCAGATGCAAGG 
CGGTGGGCAAGACGCGTTGCTTATTACAGGGATTCTCTTAACTTGAAGCTGAAATCTCCA 
ACTGTCCGCAATGTCATGGACATGAACGCATTCTTCGGAGGCTTTGCAGCAACCCTTGCA 
TCTGATCCTGTGTGGGTTATGAATGTCATTCCAGCTCGGAAGCCATTAACTCTTGACGTG 
ATTTATGACAGAGGTCTCATCGGTGTTTACC^TGATTGGTGTGAACCATTTTCAACATAT 
CCCCGCACGTATGATTTCATCCATGTATCAGGAATTGAATCACTGATAAAACGACAAGAC 
TCAAGCAAATCGAGGTGTAGCCTAGTAGATCTAATGGTAGAGATGGACAGAATATTACGT 
CCAGAAGGAAAGGTTGTGATCCGAGACTCTCCTGAGGTGCTAGATAAAGTCGCACGAATG 
GCTCATGCTGTAAGATGGTCTTCTTCCATACACGAGAAAGAACCTGAATCCCATGGAAGA 
GAGAAGATTCTTATCGCAACCAAATCTCTCTGGAAATTGCCATCAAACTCCCACTGAAGA
CACAAAAGAAGAAGAAAAGAAGAAGCTCTTCTCAATCTTGTAGGTACTGTCACTTGCTCT 
CCAGCCC

>G1852 Amino Acid Sequence (domain in AA coordinates: 1-601) 

MGHVNLPASKRGNPRQWRLLDIVTAAFFGIVLLFFILLFTPLGDSMAASGRQTLLLSTAS 

DPRQRQRLVTLVEAGQHLQPIEYCPAEAVAHMPCEDPRRNSQLSREMNFYRERHCPLPEE 

TPLCLIPPPSGYKIPVPWPESLHKIWHANMPYNKIADRKGHQGWMKREGEYFTFPGGGTM 

FPGGAGQYIEKLAQYIPLNGGTLRTALDMGCGVASFGGTLLSQGILALSFAPRDSHKSQI 

QFALERGVPAFVAMLGTRRLPFPAYSFDLMHCSRCLIPFTAYNATYFIEVDRLLRPGGYL 

VISGPPVQWPKQDKEWADLQAVARALCYELIAVIX3NTVIWKKPVGDSCLPSQNEFGLELC 

DESVPPSDAWYFKLKRCVTRPSSVKGEHALGTISKWPERLTKVPSRAIVMKNGLDVFEM 

ARRWARRVAYYRDSLNLKLKSPTVRNVMDMNAFFGGFAATLASDPVWVMNVIPARKPLTL

DVIYDRGLIGVYHDWCEPFSTYPRTYDFIHVSGIESLIKRQDSSKSRCSIiVDLMVEMDRI 

LRPEGKVVIRDSPEVLDKVARMAHATOWSSSIHEKEPESHGREKILIATKSLWKLPSNSH 

* 

>G1793 (59..1783)

AGTGATTTATTGATTAACCCAAACACAAAATAAACAGATTTGACTCAAAAAGAAGAAAAT 
GAATTCTAACAACTGGCTTGGCTTTCCTCTTTCACCGAACAACTCTTCTTTGCCTCCTCA 
TGAATACAACCTTGGCTTGGTCAGCGACCATATGGACAACCCTTTTCAAACACAAGAGTG 
GAATATGATCAATCCACACGGTGGAGGAGGAGATGAAGGAGGAGAGGTTCCAAAAGTGGC 
CGATTTTCTCGGTGTGAGCAAACCGGACGAAAACCAATCCAACCACCTAGTAGCTTACAA 
CGACTCAGACTACTACTTCCATACCAATAGCTTGATGCCTAGCGTCCAATCAAACGATGT 
CGTTGTAGCAGCTTGTGACTCCAATACTCCTAACAACAGTAGCTATCATGAGCTTCAAGA 
GAGTGCTCACAATCTACAGTCACTTACTTTGTCCATGGGGACCACCGCTGGTAATAATGT 
TGTAGACAAAGCTTCACCATCCGAGACCACCGGGGATAACGCTAGCGGTGGAGCACTAGC 
CGTTGTTGAGACGGCCACGCdAAGACGTGCATTGGACACTTTCGGACAACGAACCTCGAT 
CTATCGTGGTGTCACAAGACATCGATGGACTGGTCGATATGAGGCTCATCTATGGGATAA 
TAGTTGTAGAAGGGAAGGCCAGTCTAGGAAAGGAAGACAAGTTTACTTGGGTGGATATGA 
CAAAGAAGATAAAGCAGCAAGATCATATGATCTAGCTGCACTTAAGTACTGGGGTCCTTC 
AACTACTACTAATTTCCCCATTACAAACTACGAGAAAGAAGTAGAGGAAATGAAGCACAT 
GACGAGACAAGAGTTCGTGGCTGCCATTAGAAGGAAAAGTAGTGGATTTTCGAGAGGCGC 
TTCGATGTATCGAGGAGTTACAAGGCATCACCAACATGGAAGATGGCAAGCAAGGATCGG 
CCGAGTCGCCGGAAACAAAGACCTCTACTTGGGAACTTTTAGCACTGAGGAAGAAGCAGC 
AGAAGCTTACGATATAGCTGCAATAAAGTTTAGAGGACTTAATGCAGTGACCAACTTCGA 
GATCAACCGGTACGACGTGAAAGCCATTCTAGAGAGTAGCACTCTTCCCATCGGAGGAGG 
CGCAGCTAAACGGCTCAAAGAAGCTCAAGCTCTTGAGTCTTCAAGGAAACGCGAGGCGGA 
GATGATAGCCCTTGGTTCAAGTTTCCAGTACGGTGGTGGCTCGAGCACAGGCTCTGGCTC 
CACGTCATCAAGACTTCAGCTTCAACCTTACCCTCTAAGCATTCAACAACCATTAGAGCC 
TTTTCTATCTCTTCAGAACAATGACATCTCTCATTACAACAACAACAATGCTCACGATTC 
CTCCTCTTTTAATCACCATAGCTATATCCAGACACAACTTCATCTCCACCAACAGACCAA 
CAATTACTTGCAGCAACAGTCGAGCCAGAACTCTCAGCAGCTCTACAATGCGTATCTTCA 
TAGCAATCCGGCTCTGCTTCATGGACTTGTCTCTACCTCTATCGTTGACAACAATAATAA 
CAATGGAGGCTCTAGTGGGAGCTACAACACTGCAGCATTTCTTGGGAACCACGGTATTGG 
TATTGGGTCCAGCTCGACTGTTGGATCGACCGAGGAGTTTCCAACCGTTAAAACAGATTA 
CGATATGCCTTCCAGTGATGGAACCGGAGGGTATAGTGGTTGGACCAGTGAGTCTGTTCA 
GGGGTCAAACCCTGGTGGTGTTTTCACTATGTGGAATGAGTAAACAAGGATCTCTTTCTT 
GCGGCACAAGGAATGGGT 

>G1793 Amino Acid Sequence (conserved domain in AA coordinates: 179-255, 281-349)

MNSNNWLGFPLSPNNSSLPPHEYWLGLVSDHMDNPFQTQEWNMINPHGGGGDEGGEVPKV 

ADFLGVSKPDENQSlJlHLVAYNDSDYYFHTO 

ESAH^TLQSLTLSMGTTAGNlW^KASPSETTGDNASGGALAvv^TATPRRALDTFG 

IYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLGGYDKEDKAARSYDLAALKYWGP

STTTNFPITNYEKEVEEMKHMTRQEFVAAIRRKSSGFSRGASMYRGVTRHHQHGRWQARI

GRVAGNKDLYLGTFSTEEEAAEAYDIAAIKFRGLNAVTNFEINRYDVKAILESSTLPIGG 

GAAKRLKEAQALESSRKREAEMIALGSSFQYGGGSSTGSGSTSSRLQLQPYPLSIQQPLE 

PFLSLQ^^^roISHYNNlTOAHDSSSFNHHSYI 

HSNPALLHGLVSTSIVDN1INNNGGSSGSYNTAAFLGNHGIGIGSSSTVG 
YDMPSSDGTGGYSGWTSESVQGSNPGGVFTMWNE*
>G761 (521..1549)

GGGGCCGACCGGCCGCCCGGGCAGGTCTAGGTTCAAAAGGACTCACAAGAGAGAGATAGT 
ATGATTGATAGGGAAAGAGAGAGAGATGAAAGAAAGTAAAATATATAATAGATTATTAGG 
ACACGAGTGT(^TCTTTTGATTTGTGTCTTGTGTGOTCTCTCTTTCTTCTCTTCCTCGAA 
TGATCATCTTTATATAACCCTACTCTCTTTCTCTTTTCCCATTCTTTCATATCATTCTCC 
CTTTCTCTCTCGGGATCTGATCTCTCTTTCCAGTAACCTATTCCCGAGGAGCACTGTCAA 
ATCTTGTCCACTCTTTGATCTTATCTCGATCTCTTTCTCTTTCTAGTCTTGTGTAGTCTT 
CAAACTTGTGATGTTATCTATATAGTAATCACGAGAGAGAATCATACAATAGCTGAAACA 
TAAAGCTTTCTTAGAAGCTTTAAAAAGGTCTCATCTGGATTATCCTGTTTAATTTCTAGA 
GTTTCTTC^GGCAGATTATTAACCGATCAAGAA 

CCCTCCGGGTTTTAGATTTCACCCGACAGATGAAGAACTTGTAGACTACTACCTGAGGAA 

AAAAGTCGCATCGAAGAGAATAGAAATTGATTTCATAAAGGACATTGATCTTTACAAGAT 

TGAGCCATGGGACCTTCAAGAGTTGTGCAAAATTGGGCATGAAGAGCAGAGTGATTGGTA 

CTTCTTTAGCCATAAAGACAAGAAGTATCCCACAGGGACTCGAACCAATAGAGCAACAT^ 

AGCAGGGTTTTGGAAAGCCACCGGAAGAGATAAGGCTATCTATTTGAGGCATAGTCTAAT 

TGGCATGAGGAAAACACTTGTGTTTTACAAGGGAAGAGCCCCAAATGGACAAAAGTCTGA 

TTGGATCATGCACGAATACCGCTTAGAAACCGATGA7\AACGGAACTCCTCAGGAAGAAGG 

ATGGGTTGTGTGTAGGGTTTTCAAGAAGAGATTGGCTGCAGTTAGACGAATGGGAGATTA 

CGACTCATCCCCTTCACATTGGTACGATGATCAACTTTCTTTTATGGCCTCCGAGCTCGA 

GACAAACGGTCAACGACGGATTCTCCCCAATCATCATCAGCAGCAGCAGCACGAGCACCA 

ACAACATATGCCATATGGCCTCAATGCATCTC 

ATGCAAGCAAGAGCTAGAACTACACTACAACGACCTGCT^ 

ACAATTGAATCAAGGAAATCAGAACTTCAGCTCTCTATACATGAACAGCGGCAACGAGCA 
AGTGATGGACCAAGTCACAGACTGGAGAGTTCTCGATAAATTTGTTGCTTCTCAGCTAAG 
CAACGAGGAGGCTGCCACAGCTTCTGGATCTATACAGAATAATGCCAAGGACACAAGCAA 
TGCTGAGTACCAAGTTGATGAAGAAAAAGATCCGAAAAGGGCTTCAGACATGGGAGAAGA 
ATATACTGCTTCTACTTCTTCGAGTTGTCAGATTGATCTATGGAAGTGAGCTGAAAGAGA 
AGACATATAAATGCATATATACATATATATATATACGTACACACGAACACT7UITCAAGTG 
TAGATGATGATGATGGTACAGATTTATATTTGCTTTGATTGATTCTTACTACATTATTGA 
ACTTATGTCATATGCATATATACATTGCGTATCTATGCATATTTATACTTGTACTCAATA 
TGATTAACCATATATAAACTCTAATCTAAATGTAACTCCAATATTTTTTAAATAGACAAT 
TGTCTCTTCTTATTAGAAAAAAAA 

>G761 Amino Acid Sequence (domain in AA coordinates: 10-156) 
MNSFSHVPPGFRFHPTDEELVDYYLRKKVASKRIEIDFIKDIDLYKIEPWDLQELCKIGH
EEQSDWYFFSHKDKKYPTGTRTNRATKAGFWKATGRDK^^ 

PNGQKSDWIl^EYRLETDENGTPQEEGV^CRWKKRI^VRRMGDYDSSPSHWYDDQLS 
FMASELETNGQRRILPNHHQQQQHEHQQHMPYGLNASAYALNNPNLQCKQELELHYNH 
SNIAHEEQLNQGNQNFSSLYMNSGNEQVMDQVTDWRVLDKFVASQLSNEEAATASASIQN 
NAKDTSNAEYQVDEEKDPKRASDMGEEYTASTSSSCQIDLWK*
>G1056 (10..798)

GCTACATATATGGGTTCTATTAGAGGAAACATTGAAGAGCCTATATCTCAGTCATTAACG 
AGGCAGAACTCTCTCTATAGCTTAAAGCTCCATGAGGTTCAAACCCACTTAGGAAGTTCT 
GGAAAACCACTAGGAAGCATGAACCTTGATGAGCTTCTCAAGACTGTCTTGCCACCAGCT 
GAGGAAGGGCTTGTTCGTCAGGGAAGCTTGACGTTACCTCGAGATCTCAGTAAAAAGACA 
GTTGATGAGGTCTGGAGAGATATCCAACAGGACAAGAATGGAAACGGTACTAGTACTACT 
ACTACTCATAAGCAGCCTACACTCGGTGAAATAACACTTGAGGATTTGTTGTTGAGAGCT 
GGTGTAGTGACTGAGACAGTAGTCCCTCAAGAAAATGTTGTTAACATAGCTTCAAATGGG 
CAATGGGTTGAGTATCATCATCAGCCTCAACAACAACAAGGGTTTATGACATATCCGGTT 
TGCGAGATGCAAGATATGGTGATGATGGGTGGATTATCGGATACACCACAAGCGCCTGGG 
AGGAAAAGAGTAGCTGGAGAGATTGTGGAGAAGACTGTTGAGAGGAGACAGAAGAGGATG 
ATCAAGAACAGAGAATCTGCAGCACGTTCACGAGCTAGGAAACAGGCTTATACACATGAA 
TTAGAGATCAAGGTTTCAAGGTTAGAAGAAGAAAACGAAAAACTTCGGAGGCTAAAGGAG 
GTGGAGAAGATCCTACCAAGTGAACCACCACCAGATCCTAAGTGGAAGCTCCGGCGAACA 
AACTCTGCTTCTCTCTGATCCTAAAGACTCTTCTTTCTTTCTTCTTCTTTGTGTTGGTTT 
ATATCAGACCGCTTTGTTCTTTGTATATTGTGTAGACTTTATTGACTTTGAACAGCATGT 
CTTTATAAACATTTCTTGAGTGT 
>G1056 Amino Acid Sequence (domain in AA coordinates: 183-246)
MGSIRGNIEEPISQSLTRQNSLYSLKLHEVQTHLGSSGKPLGSMNLDELLKTVLPPAEEG 
LVRQGSLTLPRDLSKKTWEWRDIQQDKNGNGTSTTTTHKQPTLGEITLEDLLLRAGVV 
TETVVPQENVWIASNGQWTCYHHQPQQQQGFMTYPVCEM 

VAGEIVEKTVERRQKRMIKNRESAARSRARKQAYTHELEIKVSRLEEENEKLRRLKEVEK

ILPSEPPPDPKWKLRRTNSASL* 

>G1447 (82..1086)

AAAAACCCTAACCCTAATTCTCTCAAGACAACTCAAAGGTCTCTCCTTTTTTAGGTTTAT 
TATCACTTCCGTATAATCGCCATGTCTTCTCTACCATGGAAAAAACCAAAATCGAGTCGA 
ATCTTAAGATTCATTTCTGAGTTTCAACAATCACCGTTCGTTGAT^CTGGCTTTCCAACT 
TCTCTGATCGATCTCTTCTTCAAGAATCGCGATCGTCTAAAAAAATCTCCATCTAAACGC 
TTCCAACGAATCGAACGCCAGATTCGAACCGCTCCAAACGCTTCTTCGTTGAGTAATCAA 
GATACGATTTTTGAAAAGCCCTCGAGGATTAAAACCGTTCGAAGTAAGGTCGAGAAAGTT 
AATTGCGTTAAAGGTAAATCAGCGGCGTTGAAGAAGAACGCGATTAAAAATAGCGTTTTC 
GGCGGTAGCGGTGAGGTCGTTTTGATGGCGTTTAAGGTTTTGATAGTAGCGTTGCTCGCC 
TTGAGCACGAAGAAGAAGCTGACTTTAGGAATGACTCTCTCTGCCTTCGCTCTTCTCTTA 

GCGATTGCCCGCGAGAAT^ATCGAAACTTTTGATGAAACTCGAGTTCCCAAAGCGATTCCA 
TGTCCTGAGGAAACAGAGCATGTAGTATCTGAAACAGAGGTTTCGAAGTTGAAAGGTTTA 
ACGATACGTGATCTGTTGTCAAAGGACGAGAAATCAACAAGTAAAAGTTGGAGACTAAAA 
TCGAAGATTGTGAAGAAGTTGAGGAGTTACAATAAGAAGGATAAGAAGACGATGAAGATC 
AAAGAAGAGTCTTTGATTGAAGTCTCGAGTTTGGTTTTAGAAGATAT^ACCAAAGAAAATT 
GAGTCTGAGAGAGACGAAGAAGAAACGTTGAATCCTCCAGTGGTTGGATCAAACCTGAAT 
GGGATTGTTCTGATCGTGATTGTGCTAACCGGTTTGTTATGTGGGAAGGTCTTAGCTATT 
GTTCTGACACTATCATGTTTGGTTCTTAGATTAGGAGCAGTCAAAAAAGTTAATCTTTGC 
ATATAATTTTTTTTGTATTTTTTAACATGCTTGCATGTGAAACTGTAAATTTTTCTCATT 
CATATGAAGGAGATTGGATTGAATGTTGAATACTAAA 

>G1447 Amino Acid Sequence (domain in AA coordinates: 3-54, 124-156) 
MSSLPWKKPKSSRILRPISEFQQSPFVETGFPTSLIDLFFKNRDRLKKSPSKRFQRIERQ 
IRTAPNASSLSNQDTIFEKPSRIKTVRSKOTKVNCVKGKSAALKKNAIKNSVFGG 
LMAFKVLIVALLALSTKKKLTLGITL^ 

ETFDETRVPKAIPCPEETEHWSETEVSKLKGLTIRDLLSKDEKSTSKSWRLKSKIVKKL 
RSYNKKDKKTMKIKEESLIEVSSLVLEDKPKKIESERDEEETLNPPWGSNLNGIVLIVI 
VLTGLLCGKVLAIVLTLSCLVLRLGAVKKVNLCI*
>G323 (77..826)

CTGCTCATATCAGCCATTGACACAGTTGCTTTGGGTTTCCCTCAAACGGCGCCGATTGTC 
TGGATTTTGACCACTGATGGCCTTAGATCAATCTTTTGAAGATGCTGCTTTACTTGGAGA 
ACTCTATGGAGAAGGTGCATTTTGTTTCAAGAGCAAGAAACCTGAACCCATTACAGTCTC 
GGTTCCTTCTGATGATACTGATGATTCGAATTTTGACTGCAATATTTGCTTAGACTCGGT 
GCAAGAACCTGTTGTGACTCTCTGTGGTCACCTCTTTTGCTGGCCTTGTATTCACAAATG 
GCTTGATGTACAGAGCTTCTCAACAAGTGATGAATACCAAAGACATAGACAGTGTCCTGT 
TTGTAAATCTAAAGTTTCTCATTCTACTTTGGTTCCTTTGTATGGTAGAGGCCGTTGTAC 
TACTCAGGAGGAAGGTAAAAACAGTGTGCCTAAAAGACCCGTAGGACCGGTTTATCGGCT 
TGAAATGCCGAATTCACCTTATGCAAGTACTGATCTGCGGTTATCACAACGGGTTCATTT 
CAATAGCCCACAGGAAGGTTACTACCCTGTCTCAGGGGTGATGAGCTCGAACAGTTTATC 
ATACTCTGCTGTTTTGGATCCGGTGATGGTGATGGTTGGAGAAATGGTAGCTACGAGGTT 
GTTTGGAACACGAGTGATGGATAGATTTGCGTATCCGGACACTTACAATCTCGCAGGGAC 
TAGCGGGCCGAGGATGAGAAGGCGGATAATGCAGGCAGATAAATCGCTGGGAAGAATCTT 
CTTCTTCTTTATGTGTTGTGTTGTTCTGTGTCTTCTCTTGTTTTAGGTTTTCATAGCTAG 
CTTGGTTCTGCTACTGTTCAGTTTCTTCAGG 

>G323 Amino Acid Sequence (conserved domain in AA coordinates: 48-96)

MALDQSFEDAALLGELYGEGAFCFKSKKPEPITVSVPSDDTDDSNFDCNICLDSVQEPVV 

TLCGHLFCWPCIHKWLDVQSFSTSDEYQRHRQCPVCKSKVSHSTLVPLYGRGRCTTQEEG 

KNSVPKRPVGPVYRLEMPNSPYASTDLRLSQRVHFNSPQEGYYPVSGVMSSNSLSYSAVL 

DPVMV1WGEMVATRLFGTRVMDRFAYPDTYNLAGTSGPRMRRRIMQ 

CWLCLLLF* 

>G176 (41..1606)
AGAAGAAGAAGAAGAAGAGTACCTCATACGTAAACCATTGATGGGCTCTTTTGATCGCCA
AAGAGCTGTTCCGAAATTCAAAACAGCAACACCGTCACCGCTCCCTCTTTCTCCTTCGCC 
TTACTTCACTATGCCTCCTGGCCTTACTCCCGCCGACTTTCTCGACTCTCCTCTTCTCTT 
CACTTCCTCCAACATTTTGCCGTCTCCTACGACAGGCACATTTCCAGCGCAATCTCTGAA 
CTATAACAATAACGGTTTGCTCATTGACAAAAATGAAATCAAATATGAAGACACAACTCC 
TCCCTTGTTCCTACCATCTATGGTAACTCAGCCTTTACCTCAACTGGATTTATTCAAATC 
CGAAATCATGTCGAGTAACAAAACCTCTGATGACGGCTACAATTGGCGCAAATACGGGCA 
GAAGCAAGTCAAAGGAAGCGAAAACCCGAGGAGTTACTTCAAATGCACGTATCCAAATTG 
TCTCACAAAGAAGAAAGTAGAGACGTCTCTTGTGAAGGGTCAGATGATTGAGATTGTCTA 
TAAAGGAAGCCACAATCATCCCAAGCCCCAATCCACGAAGCGATCATCTTCCACCGCTAT 
AGCAGCACATCAGAACAGCAGTAATGGAGACGGTAAAGACATTGGTGAAGATGAAACAGA 
GGCCAAGAGATGGAAAAGAGAAGAGAATGTGAAGGAGCCAAGAGTGGTGGTTCAGACAAC 
AAGTGATATAGACATTCTTGACGATGGCTACAGATGGAGAAAGTATGGTCAGAAAGTCGT 
CAAGGGTAATCGAAATCGAAGGAGCTATTAGAAGTGCA 

GAAACACGTTGAAAGAGCATTTCAAGATCCCAAGTCAGTGATCACAACTTACGAAGGAAA 
ACACAAACACCAAATCCCGACCCCAAGAAGAGGTCCAGTTTTAAGATCTGCTGCAATGGC 
TTCTCCTCTTCTCCCAACTTCGACT^ 

GCTGAGCTCTCTACGCGTCCTCTTGTCCCGCGTTCTAGCCACCGTCCGTCACGCTTCTGC 

AGATGCCAGACCCTGGGCAGAGCTCGTTGACCGGTCAGCGTTTTCCCGGCCACCATCGCT 

CTCGGAGGCAACGTCACGAGTAAGGAAGAACTTTTCCTATTTCCGAGCCAATTACATAAC 

CTTAGTGGCAATCTTACTCGCCGCGTCTCTGCTCACGCACCCTTTCGCTCTCTTCCTCCT 

CGCATCGCTGGCCGCTTCITGGCTTTTCCTCTACTTTTTCCGTCCGG 

GGTCATTGGAGGACGCACGTTCTCCGATCTTGAGACGCTAGGGATACTCTGCCTGTCCAC 

TGTGGTGGTGATGTTCATGACCAGCGTTGGATCGCTCTTGATGTCCACTCTAGCAGTTGG 

GATCATGGGCGTGGCCATCCACGGAGCGTTTCGTGCTCCCGAAGACCTGTTTCTTGAAGA 

ACAAGAAGCCATTGGATCTGGACTTTTCGCATTCTTCAACAACAATGCCTCTAATGCAGC 

TGCCGCTGCCATAGCCACCTCAGCAATGTCACGCGTTCGAGTCTGAGATTGTTGAAGAGA 

CTACATTCCTACACCG<^TTTCCAAAGTGTGATATTrATTCATATT 

>G176 Amino Acid Sequence (domain in AA coordinates: 117-173, 234-29X)

MGSFDRQRAVPKFKTATPSPLPLSPSPYFTMPPGLTPADFLDSPLLFTSSNILPSPTTGT 

FPAQSLNYNNNGLLIDKNEIKYEDTTPPLFLPSMVTQPLPQIJDL 

NWRKYGQKQVKGSENPRSYFKCTYPNCLTKKKVETSLVKGQMIEIVYKGSHNHPKPQSTK 

RSSSTAIAAHQNSSNGDGKDIGEDETEAKRWKREENVKEPRVVVQTTSDIDILDDGYRWR

KYGQKVVKGNPNPRSYYKCTFTGCFWKHVERAFQDPKSVITTYEGKHKHQIPTPRRGPV 

LRSAAMASPLLPTSTTPDQLPGGDPQLLSSLRVLLSRVLATVRHASADARPWAELVDRSA 

FSRPPSLSEATSRVRKNFSYFRANYITLVAILLAASLLTHPFALFLLASLAASWLFLYFF 

RPADQPLVIGGRTFSDLETLGILCLSTVVVMFiyrrSVGSLLMSTIiAVGIMGVAIHGAFRAP 

EDLFLEEQEAIGSGLFAFFNNNASNAAAAAIATSAMSRVRV* 

>G174 (194..1585)

CCCAATTTGAGATTGTTCGATTTCGATCTACGAGATTCTTACAAGAACATAAGCAGCTTC 
GGTTTTTTGGGATTATCTTATTTGGTCGGATGATGATCTTCTCGATGTCTGTGCTAGGCT 
TTGGGAATTAGATATATTTGGGGTTAAGCTCGAGTCTCTCCGGTTTTGAGTTTACTTGAG 
TTTGTTAGTATTTATGGCTGAGGTGGGAAAAGTTCTGGCTAGTGATATGGAGTTAGACCA 
TTCAAATGAGACTAAAGCAGTGGATGATGTTGTTGCCACTACTGATAAAGCGGAGGTCAT 
ACCAGTGGCTGTAACTAGAACTGAAACCGTTGTTGAAAGTTTGGAATCTACTGACTGTAA 
GGAGCTTGAA7VAACTTGTTCCACATACGGTAGCTTCGCAGTCGGAAGTAGATGTTGCTTC 
CCCGGTATCCGAGAAAGC^CCGAAGGTTTCTGAAAGTAGCGGTGCATTATCTTTGCAGTC 
TGGTTCGGAAGGGAATAGTCCTTTTATTCGTGAGAAGGTTATGGAAGACGGATACAACTG 
GCGGAAATATGGACAGAAACTTGTGAAAGGAAATGAGTTTGTAAGGAGCTATTACAGGTG 
CACTCACCCTAACTGCAAAGCGAAAAAACAGTTGGAACGGTCTGCGGGTGGACAAGTCGT 
GGATACCGTTTACTTTGGGGAACATGATCACCCAAAGCCTCTTGCTGGTGCTGTTCCTAT 
CAATCAGGATAAGCGAAGTGATGTCTTCACAGCTGTTAGTAAAGAGAAAACATCTGGATC 
CAGTGTTC^GACACTTCGTCAAACCGAACCACCAAAGATCCATGGAGGATTACATGTTTC 
AGTTATTCCACCAGCTGATGATGTGAAAACTGATATTTCACAATCAAGTAGGATAACGGG 
GGACAACACTCACAAGGATTATAATAGTCCTACCGCAAAGCGAAGGAAGAAAGGAGGGAA 
CATTGAGCTGAGTCCAGTGGAGAGGTCAACCAATGATTCACGCATTGTGGTTCACACTCA 
GACTCTGTTTGATATTGTGAATGATGGGTACCGATGGCGTAAATATGGTCAGAAATCAGT 
AAAAGGCAGCCCATATCCAAGGAGCTACTATAGATGTTCAAGCCCTGGATGCCCCGTCAA
GAAACACGTAGAGAGGTCATCTCATGACACAAAGTTGCTTATAACAACTTACGAGGGAAA 
ACACGACCACGATATGCCTCCAGGAAGAGTTGTTACTCATAATAACATGCTGGACTCGGA 
AGTTGATGATAAAGAAGGAGATGCCAACAAGACTCCACAGAGCTCAACTCTTCAATCCAT 
TACAAAAGACCAGCATGTCGAAGATCACTTAAGAAAGAAAACGAAGACTAATGGCTTTGA
GAAAAGTCTTGATCAAGGTCCAGTTTTGGATGAGAAGCTGAAGGAGGAAATAAAAGAGAG 
ATCAGATGCAAACAAAGATCACGCAGCCAATCACGCCAAGCCGGAAGCAAAGTCAGATGA 
TAAAACCACTGTTTGTCAAGAGAAGGCAGTAGGAACCCTGGAGAGCGAGGAACAAAAACC 
CAAGACAGAGCCTGCCCAAAGCTAAGCATTCAGTGTTGTACCGAGTGGTAATTTATATGG 
CTGTTTTAACATAGATTAGTAC7VGGCGATATGGTTATAGACTGTACAGTTGTTGTTCAGG 
CGGGACCAGATTTAGATTAGTGTTTAATGGAATAGTATGCTTTAATACCTTTATGTAACC 
ACTTCCATTTGGTTCAAATAAGAGTTACAGGAAGAGAAGGTAACACAACAAGAGCCCTTC 
TTTGTTGATGGAGCCTGTGTAATAGTTGTAGCATGGGGATGTATATGATTTGATTCAACC 
TTATTAATGGTTATGAGACAAAACTATC 

>G174 Amino Acid Sequence (domain in AA coordinates: TBD) 
MAEVGK\^SDMELDHSNETKAVDDWATTDKAEVIPVAVTRTETVVESLESTDCKELEK 
LVPHTVASQSEVDVASPVSEKAPKVSESSGALSLQSGSEGNSPFIREKVMEDGYNWRKYG 
QKLVKGNEFVRSYYRCTHPNCKAKKQLERSAGGQVVDTW 

RSDVFTAVSKEKTSGSSVQTLRQTEPPKIHGGLHVSVIPPADDVKTDISQSSRITGDNTH 

KDYNSPTAKRRKXGGNIELSPVERSTNDSRIVWTQTLFDIVNDGYRWRKYGQKSVKGSP 

YPRSYYRCSSPGCPVKKHVERSSHDTKLLITTYEGKHD^ 

EGDANKTPQSSTLQSITKDQHVEDHLRKKTKTN^ 

KDHAANHAKPEAKSDDKTTVCQEKAVGTLESEEQKPKTEPAQS*

>G715 (1..705) 

ATGGATACCAACAACCAGCAACCACCTCCCTCCGCCGCCGGAATCCCTCCTCCACCACCT 

GGAACCACCATCTCCGCCGCAGGAGGAGGAGCTTCTTACCACCACCTTCTCCAACAACAA 

CAACAACAGCTCCAACTATTCTGGACCTACCAACGCCAAGAGATCGAACAAGTTAACGAT 

TTCAAAAACCATCAGCTTCCACTAGCTAGGATAAAAAAGATCATGAAAGCCGATGAAGAT 

GTTCGTATGATCTCCGCAGAAGCACCGATTCTCTTCGCGAAAGCTTGTGAGCTTTTCATT 

CTCGAGCTCACGATCAGATCTTGGCTTCACGCTGAGGAGAATAAACGTCGTACGCTTCAG 

AAAAACGATATCGCTGCTGCGATTACTAGGACTGATATCTTCGATTTCCTTGTTGATATT 

GTTCCTAGAGATGAGATTAAGGACGAAGCCGCAGTCCTCGGTGGTGGAATGGTGGTGGCT 

CCTACCGCGAGCGGCGTGCCTTACTATTATCCGCCGATGGGACAACCAGCTGGTCCTGGA 

GGGATGATGATTGGGAGACCAGCTATGGATCCGAATGGTGTTTATGTCCAGCCTCCGTCT 

CAGGCGTGGCAGAGTGTTTGGCAGACTTCGACGGGGACGGGAGATGATGTCTCTTATGGT 

AGTGGTGGAAGTTCCGGTCAAGGGAATCTCGACGGCCAAGGGTAA 

>G715 Amino Acid Sequence (domain in AA coordinates: 60-132) 

MDTNNQQPPPSAAGIPPPPPGTTISAAGGGASYHHLLQQQQQQLQLFWTYQRQEIEQVND 

FKNHQLPLARIKKIMKADEDVRMISAEAPILFAKACELFILELTIRSWLHAEENKRRTLQ 

KNDIAAAITRTDIFDFLVDIVPRDEIKDEAAVLGGGMWAPTASGVPYYYPPMGQPAGPG 

GMMIGRPAMDPNGVYVQPPSQAWQSVWQTSTGTGDDVSYGSGGSSGQGNLDGQG* 

>G588 (196..1599)

ATCTGAAGTGAACCAAGCTCAGGTTTTGTCTTCTCTTTGATCATTCCTTTCTCAGCAATA 
TAAATTAGAGTTATATCCTTTATAAAGGATTTTGCTTTTTCACCAACAAACCCTAAATTC 
GGTGTCTCAGCAAGAATCACGTGATTCTCGTTCCTCTTCCTCACGAAACCCATCATCTTC 
TATCTCATTTGAGAAATGGGTCAAAAGTTTTGGGAGAATCAAGAAGATCGAGCGATGGTT 
GAATCCACCATAGGCTCTGAAGCTTGCGACTTTTTCATCTCAACAGCTTCAGCTTCCAAC 
ACTGCCTTGTCCAAGCTTGTCTCACCACCAAGTGATTCCAATCTCCAACAAGGGTTACGT 
CACGTTGTTGAAGGATCTGATTGGGATTATGCTCTTTTCTGGCTAGCGTCCAACGTTAAT 
AGCTCTGATGGTTGTGTCTTGATCTGGGGAGATGGTCATTGCCGTGTCAAAAAGGGTGCT 
TCAGGTGAGGATTACTCTCAGCAAGATGAGATCAAAAGACGTGTGCTTCGCAAGCTTCAC 
TTGTCGTTCGTTGGTTCAGATGAAGATCATCGTTTGGTGAAATCAGGAGCTCTTACTGAT 
CTCGACATGTTTTATCTGGCTTCTTTGTACTTTTCCTTTAGGTGTGATACCAATAAGTAC 
GGTCCTGCTGGAACCTATGTGTCTGGGAAGCCTCTTTGGGCTGCAGATTTGCCTAGCTGC 
TTGAGTTATTATAGGGTTAGGTCTTTCTTAGCTAGGTCAGCTGGTTTTCAGACTGTGTTG 
TCTGTACCAGTGAATTCTGGAGTTGTGGAGCTTGGTTCTTTAAGACATATTCCAGAAGAT 
AAGAGTGTGATTGAGATGGTGAAATCAGTGTTTGGTGGGTCTGACTTTGTTCAGGCTAAA 
GAAGCTCCTAAAATCTTTGGTCGACAGCTGAGTCTTGGTGGAGCAAAACCTCGGTCTATG
AGTATTAATTTCTCCCCGAAGACCGAGGATGACACGGGTTTCTCATTGGAATCGTATGAG 
GTGCAAGCGATCGGAGGCTCTAATCAAGTGTATGGTTATGAGCAAGGGAAAGATGAGACA 
TTGTATCTAACTGACGAGCAAAAGCCGAGGAAGAGAGGGAGAAAACCAGCAT^ATGGAAGA 
GAAGAGGCTCTAAACCATGTGGAAGCGGAACGGCAGAGGAGGGAGAAGCTGAACCAGAGA 
TTCTACGCTTTGAGAGCGGTGGTGCCTAACATCTCCAAGATGGACAAGGCTTCGCTCCTT 
GCAGACGCAATCACTTACATCACGGATATGCAGAAGAAAATCAGGGTGTATGAAACAGAG 
AAGCAGATAATGAAGAGGAGGGAGAGTAATCAGATAACTCCAGCAGAGGTTGATTATCAA 
CAGAGGCATGATGATGCAGTTGTAAGGCTAAGCTGTCCGTTGGAAACTCATCCAGTTTCA 
AAGGTGATACAAACGTTGAGGGAGAATGAAGTTATGCCTCATGATTCCAACGTGGCCATC 
ACAGAGGAGGGTGTGGTTCACACATTCACTCTCCGGCCTCAGGGTGGCTGCACCGCTGAG 
CAGTTGAAGGACAAGCTCCTTGCCTCTCTATCACAGTAACTATCACAGCAGTAACTGCTA 
TGTAATAAGTGTAACCGTGTTGGAGGTTGTATCAATGTACTATTGCAAGCCAACCAAAAA 
AAACTCCAGCTTAGTAGGATCGTGTAATTTTCCTTATATGTAATGTTGAGATTTGTCTTT 
TACATATAAAGATTTGA 

>G588 Amino Acid Sequence (domain in AA coordinates: 309-376) 

MGQKFWENQEDRAMVESTIGSEACDFFISTASASNTALSKLVSPPSDSNLQQGLRHVVEG 

SDWDYALFWLASNVNSSIXSCVLIWGDGHCR^ 

SDEDHRLVKSGALTDLDMFYLASLYFSFRCDTNKYGPAGTYVSGKPLWAADLPSCLSYYR 

WSFI^SAGFQTVLSVPVNSGVVELGSLRHIPEDKSVIEMVKSVFGGSDFVQAKEAPKI 

FGRQLSLGGAKPRSMSINFSPKTEDDTGFSLESYEVQAIGGSNQVYGYEQGKDETLYLTD 

EQKPRKRGRKPANGREEALNHVEAERQRREKLNQRFYALRAVVPNISK^ 

YITDMQKKIRWETEKQIMKRRESNQITPAEVDYQQRHDDAVVRLSCPLETHPVSKVIQT 

LRENEVMPHDSWAITEEGVVHTFTLRPQGGCTAEQLKDKLIiASLSQ* 

>G1758 (69..677)

GTCCCTCCTCTTAGCTTCAACCGCCGGAAAAACTAAACAACCTTCTTGGAAAAAAAGAGA 

AACTAAAAATGAACTATCCTTCAAACCCTAACCCTAGCTCCACAGATTTCACTGAATTTT 

TCAAGTTCGATGATTTTGACGATACTTTTGAGAAGATCATGGAAGAAATCGGCCGTGAGG 

ACCACTCGTCGTCACCGACTTTGAGTTGGAGTTCATCGGAAAAGTTAGTGGCTGCAGAAA 

TCACAAGCCCGCnTCAAACAAGCCTAGCTACCTCACCTATGAGCTTTGAAATAGGTGACA 

AAGATGAAATCAAAAAGAGGAAGAGACACAAAGAAGATCCGATTATTCACGTCTTCAAAA 

CGAAATCATCAATTGATGAAAAGGTTGCTTTAGATGATGGGTATAAATGGAGGAAATACG 

GAAAGAAGCCGATAACGGGTAGTCCATTTCCAAGGCATTATCACAAGTGTTCGAGCCCAG 

ATTGCAACGTGAAGAAGAAGATCGAAAGAGATACGAACAATCCAGATTACATATTGACAA 

CATACGAAGGTAGACATAACCACCCAAGCCCTTCTGTAGTTTATTGTGATTCAGACGACT 

TTGATCTTAACTCTCTCAACAATTGGTCCTTTCAGACGGCAAATACGTATAGTTTCTCTC 

ATTCTGCTCCATATTGATCGATCGTAGTTACAAGTTTGTGTATATAGATGTATATATATA 

TATCACCAATTCACCATCGTAATCACGTCTCACATGTAACTACGTACATATATCTTGTTC 

GGGGTTCGTTTTGTAATGTATTGAATTGGTGGAGGTAGAATGGAAGTCATCTTGTATAGT 

TGTACTTGTATGTAAGGTTTGATAGTCATTTTTTATAAAGTAACTAATTTGTACAA 

>G1758 Amino Acid Sequence (domain in AA coordinates: TBD) 

MNYPSNPNPSSTDFTEFFKFDDFDDTFEKIMEEIGREDHSSSPTLSWSSSEKLVAAEITS 

PLQTSIATSPMSFEIGDKDEIKKRKRHKEDPIIHVFCT^ 

PITGSPFPRHYHKCSSPDCNVKKKIERDTNNPDYILT^ 

NSLNNWSFQTANTYSFSHSAPY* 

>G2148 (66..737)

GTCTCTAATATAAGCTTGAACGTTGCTATATATAAATGTAAAGGCGAACGCATAAGAAAA 
GAAAAATGGAGAATGAAGCTTTTGTAGATGGTGAATTGGAGTCTCTTTTGGGGATGTTCA 
ACTTTGATCAATGTTCATCTAACGAATCGAGCTTTTG 

TTTTCTCTTCTGATGATTTCTTCCCATTTGGTACAATTCTGCAAAGTAACTATGCGGCCG 
TTCTTGATGGTTCCAACCACCAAACGAACCGAAATGTCGACTCAAGACAAGATCTGTTGA 
AACCAAGGAAGAAGCAAAAGTTAAGCTCGGAAAGCAATTTGGTTACCGAGCCTAAGACTG 
CTTGGAGAGATGGTCAAAGCCTAAGCAGTTATAATAGTTCAGATGATGAAAAGGCTTTAG 
GTTTAGTGTCTAATACATCAAAAAGCCTAAAACGCAAAGCGAAAGCCAACAGAGGGATAG 
CTTCCGATCCTCAGAGCCTATACGCTAGGAAACGAAGAGAAAGGATAAACGATAGGCTAA 
AGACATTGCAGAGCCTAGTTCCTAATGGGACAAAGGTCGATATAAGCACAATGCTGGAAG 
ATGCTGTCCATTACGTGAAGTTCCTGCAGCTTCAAATCAAGCTCTTGAGTTCAGAAGATC 
TATGGATGTATGCACCTCTTGCTCACAATGGTCTGAATATGGGACTACATCACAATCTTT

TGTCTCGGCTTATTTAAGACAAAATCATTGGAATAACATAACTTACAGTACTTGTTTTTT 

TTCTCGTTCTATATTCATGATTATGGTTATTTTTTGTTTGAGTTGTTCAATTTTTCTGTC 

TATTGCGTTCTATGAACTTGACACTCTTTTTGTAATT^ 

ACTAACAGCATTTTAATAAAAAAAAAAAA 

>G2148 Amino Acid Sequence (conserved domain in AA coordinates: 130-268)

MENEAFVDGELESLLGMFNFDQCSSNESSFCNAPNETDVFSSDDFFPFGTILQSNYAAVL 

DGSNHQTNRNVT3SRQDLLKPRKKQKLSSESNLW 

VSNTSKSLKRKAKANRGIASDPQSLYARKRRER^ 

VHYVKFLQLQIKLLSSEDLWMYAPLAHNGLNMGLHHNLLSRLI*

>G2379 (52..798)

CGCCGTCACTCTCCTCCCGGTGCCGCACATTAGCAACACTACTCCCGACGAATGGAGACG 
ACGACGCCGCAGTCAAT^ATCAAGTGTGTCCCACCGACCGCCGTTGGGAAGAGAAGACTGG 
TGGAGTGAGGAAGCGACGGCGACGCTGGTAGAAGCCTGGGGCAATCGTTACGTCAAGCTG 
AACCACGGAAATCTCCGGCAGAATGACTGGAAAGACGTCGCCGACGCCGTTAACTCTAGA 
CACGGTGATAACAGCCGTAAGAAGACCGACTTACAGTGTAAGAACCGGGTCGATACTTTG 
AAGAAGAAGTACAAAACAGAGAAAGCTAAACTCTCGCCGTCGACTTGGCGTTTCTATAAC 
CGCCTCGATGTTCTAATCGGTCCCGTTGTGAAGAAATCGGCnK3GCGGAGTTGTCAAATCA 
GCGCCTTTTAAGAATCATCTGAATCCAACTGGATCGAACTCTACTGGAAGCTCTCTTGAA 
GATGATGATGAGGATGATGATGAGGTTGGTGATTGGGAATTCGTTGCTAGGAAGCATCCT 
CGTGTGGAAGAGGTAGATCTGAGTGAAGGATCAACGTGTAGGGAACTAGCTACGGCGATT 
CTCAAGTTTGGAGAAGTTTACGAGAGAATTGAAGGGAAGAAGCAACAGATGATGATTGAG 
TTGGAGAAGCAGAGAATGGAAGTGACAAAGGAGGTAGAGTTAAAACGAATGAACATGTTG 
ATGGAGATGCAGTTAGAGATTGAGAAATCAAAGCACCGGAAACGCGCAAGTGCTTCAGGT 
AAGAAGAACTCACATTAGG 

>G2379 Amino Acid Sequence (domain in AA coordinates: 19-110, 173-232)

METTTPQSKSSVSHRPPLGREDWWSEEATATLVE^ 

NSRHGDNSRKKTDLQCKftTRvDTLKKKYKTEKA^ 

V7CSAPFKNHLNPTGSNSTGSSLEDDDEDDDEVGDWEF 

TAILKFGEVTERIEGKKQQMMIELEKQRMEVTKEVBL 

ASGKKNSH* 

>G1462 (63..1031)

CGTCGACCATTCTTGCGATTGATCTTTCTCTAGATAATTTTTTTGATCGATTTAGTTTCA 

TTATGGAGGACGACGACGCAGCTTATGATCTAATCAAACACGAACTGTTATACTCAGAAG 

ACGAAGTAATAATCTCACGTTATCTGAAGGGTATGGTCGTTAACGGAGATTCTTGGCCAG 

ATCACTTCATCGAAGACGCAAACGTGTTCACCAAGAATCCAGATAAGGTGTTCAATTCTG 

AGAGACCTAGATTCGTGATCGTTAAACCACGAACAGAGGCTTGTGGTAAAACCGATGGAT 

GTGATTCGGGTTGCTGGAGGATCATTGGTCGTGATAAACTGATAAAGTCGGAGGAGACTG 

GGAAGATTCTAGGGTTCAAGAAGATACTCAAGTTTTGCCTAAAGAGGAAACCTATAGACT 

ACAAGAGAAGTTGGGTAATGGAAGAGTATAGGCTTACCAATAACTTGAACTGGAAGCAAG 

ATCATGTGATTTGCAAAATTCGGTTTATGTTTGAAGCTGAAATTAGTTTCTTGCTAAGCA 

AGCATTTCTACACTACATCAGAATCGGTTCTTGAAAATGAGCTGTTGCCATCTTATGGAT 

ATTATTTATCCAATACACAAGAGGAGGATGAATTTTATCTGGACGCGATAATGACTTCGG 

AAGGAAACGAGTGGCCTAGCTACGTTACCAACAACGTGTACTGTCTGCATCCATTGGAGC 

TTGTGGATCTTCAAGATCGGATGTTTAATGATTACGGAACCTGCATCTTCGCTAACAAGA 

CTTGTGGTGAAACTGATAAATGCGATGGTGGTTACTGGAAGATCCTGCACGGTGATAAGC 

TGATCAAGTCAAATTTCGGAAAGGTCATTGGTTTCAAGAAGGTATTTGAGTTCTATGAAA 

CGGTGAGACAAATA^ATCTTTGTGATGGAGAAGAAGTGACGGTAACTTGGACTATACAAG 

AGTATAGGCTTAGCAAAAACGTGAAGCAGAATAAAGTGTTGTGCGTTATCAAGTTGACTT 

ATGATAGATAGGATACTTTACTTTGGTTTTTGTGATCATCTTAGTATCTTACGAATATTC 

TAGATACACACATCTATAGGCGACCGCTCTAGACAGGCCTCGTACCG 

>G1462 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEDDDAAYDLIKHELLYSEDEVIISRYLKGMVWGDSWPDHFIEDANVFTKNPDKVFNSE 

RPRFVIVKPRTEACGKTDGCDSGCWRIIGRDKLIKSEETGKILGFKKILKFCLKRKPIDY 

KRSWVMEEYRLTNNLNWKQDHVICKIRFMFEAEISFLLSKHFYTTSESVLENELLPSYG 

YLSNTQEEDEFYLDAIMTSEGNEWPSYVTNNVYCLH^ 

CGETDKCDGGYWKILHGDKLIKSNFGKVIGFKJCVFEFYETVRQIYLCDGEEVTVTWTIQE 
YRLSKNVKQNKVLCVIKLTYDR*
>G1211 (44..1120)

TGAAACCTAGATTTCTGCAACTG7UVTTCCTAATTCGAAAAAGAATGGAGGGTTCGTCGTC 

GACGATAGCAAGGAAGACATGGGAACTAGAGAACAGCATTCTAACAGTAGACTCACCTGA 

TTCAACCTCCGACAACATCTTCTACTACGACGATACTTCACAGACTAGGTTCCAGCAAGA 

GAAACCGTGGGAGAATGATCCTCACTACTTTAAACGAGTCAAGATCTCAGCGCTCGCTCT 

TCTTAAGATGGTGGTTCACGCTCGCTCTGGTGGTACAATTGAAATAATGGGTCTTATGCA 

AGGTAAGACCGATGGTGATACTATCATTGTTATGGATGCTTTTGCTTTACCAGTGGAAGG 

TACTGAGACAAGGGTTAATGCTCAGGATGATGCTTATGAGTACATGGTTGAGTATTCACA 

GACCAACAAGCTCGCGGGGCGGCTGGAGAATGTTGTTGGATGGTATCACTCTCACCCTGG 

ATATGGATGCTGGCTCTCCGGTATTGATGTTTCTACGCAGACGCTTAACCAACAGCATCA 

GGAGCCATTTTTAGCTGTTGTTATTGATCCCACAAGGACTGTTTCAGCTGGTAAGGTTGA 

GATTGGTGCTTTCAGAACATACTCTAAAGGATATAAGCCTCCAGATGAACCTGTTTCTGA 

GTATCAAACTATTCCTTTAAATAAGATTGAGGACTTTGGTGTTCACTGCAAACAGTACTA 

TTCATTAGATGTCACTTATTTCAAGTCATCTCTTGATTCTCACCTTCTGGATCTACTATG 

GAACAAGTACTGGGTGAACACTCTTTCTTCTTCTCCACTGCTGGGTAATGGAGACTATGT 

TGCTGGACAAATATCAGACTTAGCTGAGAAGCTTGAGCAAGCCGAGAGTCATCTGGTTCA 

GTCTCGCTTTGGAGGAGTTGTGCCATCATCCCTTCATAAGAAAAAAGAAGATGAGTCTCA 

ACTAACTAAGATAACTCGGGATAGCGCAAAGATAACTGTGGAACAGGTCCATGGACTAAT 

GTCGCAGGTC^TAAAAGATGAATTATTCAACTCAATGCGTCAGTCCAACAACAAATCTCC 

CACTGACTCGTCGGATCC^GACCCTATGATTACATATTGAAGTTGCTCTTCTTTTGGTTT 

CTANTTTTGGATTGACCCATCATTTGTTGTCCTTTCATTTATTTTCTGTTGTGTAAAGAA 

TTATAATGNCGNCGCGAATTCGCGGCCGCTAAAAAAANACAGGAAATTGAAAANAATTCN 

NCCATTCCAACATCTTTATTTAATATTATCTCCTCNATTATATAATATTCAAACATCCCT 

ANTANCTTCATTTGACCGTCCCCCTCCCTCCCGTGTTGCNTTGGTGCTGGCCCC 

>G1211 Amino Acid Sequence (domain in AA coordinates: 123-179) 

MEGSSSTIARKTWELENSILTVDSPDSTSDNIFYYDDTSQTRFQQEKPWEimPHYFKRVK 

ISALALLKMVVHARSGGTIEIMGLMQGKTDGDTIIVMDAFALPVEGTETRVNAQDDAYEY

MVEYSQTNKLAGRLENWGWYHSHPGYGCTLSGIDVSTQTLNQQHQEPFLAWIDPTRTV 

SAGKVEIGAFRTYSKGYKPPDEPVSEYQTIPLNKIEDFGVHCKQYYSLDVTYFKSSLDSH 

LLDLLWNKYWVNTLSSSPLLGNGDYVAGQISDLAEKLEQAESHLVQSRFGGVVPSSLHKK

KEDESQLTKITRDSAKITVEQVHGLMSQVIKD^ 

>G1048 (5..892)

GACCATGGCGGAGGAATTTGGAAGCATAGATTTACTCGGAGATGAAGATTTCTTCTTCGA 
TTTCGATCCTTCAATCGTAATTGATTCTCTTCCGGCGGAGGATTTTCTTCAGTCTTCACC 
GGATTCATGGATCGGAGAAATCGAGAATCAATTGATGAACGATGAGAATCATCAAGAGGA 
GAGTTTTGTGGAATTGGATCAGCAATCGGTTTCAGATTTCATAGCGGATCTACTCGTTGA 
TTATCCAACTAGCGATTCTGGCTCCGTTGATTTGGCGGCTGATAAAGTTCTAACCGTCGA 
TTCTCCCGCCGCCGCTGATGATTCCGGGAAGGAGAATTCGGATTTGGTTGTTGAGAAGAA 
GTCTAATGATTCTGGTAGCGAGATTCATGATGATGATGACGAAGAAGGAGACGATGATGC 
TGTGGCTAAAAAACGAAGAAGGAGAGTAAGAAATAGAGATGCGGCGGTTAGATCGAGAGA 
GAGGAAGAAGGAATATGTACAAGATTTAGAGAAGAAGAGTAAGTATCTCGAAAGAGAATG 
CTTGAGACTAGGACGTATGCTTGAGTGCTTCGTTGCTGAAAACCAGTCTCTACGTTACTG 
TTTGCAAAAGGGTAATGGCAATAATACTACCATGATGTCGAAGCAGGAGTCTGCTGTGCT 
CTTGTTGGAATCCCTGCTGTTGGGTTCCCTGCTTTGGCTTCTGGGAGTAAACTTCATTTG 
CCTATTCCCTTATATGTCCCACACAAAGTGTTGCCTCCTACGTCCAGAACCAGAAAAGCT 
GGTTCTAAACGGGCTCGGGAGTAGTAGCAAACCGTCTTATACCGGCGTTAGTCGGAGATG 
TAAGGGTTCGAGGCeTAGGATGAAATACCAAATCTTAACCCTTGCGGCGTGACAACGCCT 
TTTTTAACTGCTTCTTTTGCGCATTTTGAGTTGTAGATGAGTGTCTTTTAGTTTTCTCTC 
TCTTGTTTTGTATTTCGCTGTTGAAAGTTTTCTGTCTAATATCGATAAGTTAACAGTGAA 
AAAAAAAAAAAAAAA 

>G1048 Amino Acid Sequence (domain in AA coordinates: 138-190)

MAEEFGSIDLLGDEDFFFDFDPSIVIDSLPAEDFLQSSPDSWIGEIENQLMNDENHQEES 

FVELDQQSVSDFIADLLVDYPTSDSGSVDLAADKVLTVDSPAAADDSGKENSDLVVEKKS

NDSGSEIHDDDDEEGDDDAVAKKRRRRVRNRDAAVRSRERKKEYVQDLEKKSKYLERECL

RLGRMLECFVAENQSLRYCLQKGNGNNTTMMSKQESAVLLLESLLLGSLLWLLGVNFICL

FPYMSHTKCCLLRPEPEKLVLNGLGSSSKPSYTGVSRRCKGSRPRMKYQILTLAA* 
>G986 (31.. 846)

CATTAAATTGGCTCCTGTGAACCTAAATTTATGGACTATGATCCCAACACCAATCCGTTC 

GACCTTCATTTCTCCGGTAAACTTCCGAAAAGAGAAGTCTCGGCTTCAGCITCTAAAGT^ 

GTAGAGAAGAAATGGTTAGTGAAAGATGAGAAGAGAAATATGCTACAAGATGAAATAAAC 

CGGGTTAATTCGGAGAACAAGAAGCTAACCGAAATGTTAGCAAGAGTCTGTGAGAAGTAC 

TATGCTCTTAATAATCTTATGGAGGAGTTGCAGAGTCGAAAGAGTCCTGAAAGTGTTAAC 

TTTCAGAACAAACAGCTAACGGGGAAACGAAAACAAGAACTTGATGAGTTTGTTAGC 

CCAATTGGACTCAGTCTCGGACCAATCGAGAACATCACCAACGATAAAGCGACGGTTTCA 

ACCGCTTACTTTGCTGCTGAGAAGTCTGACACAAGCTTGACTGTGAAAGATGGATATCAA 

TGGAGGAAATACGGGC^AAAGATTACGAGAGATAATC^TCTCCTAGAGCTTACTTCAGA 

TGCTCGTTTTCACCGTCTTGTCTAGTCAAGAAGAAGGTGCAACGAAGTGCAGAAGATCCA 

TCTTTCTTGGTAGCCACTTACGAAGGGACACATAACCACACCGGACCACATGCAAGTGTG 

TCCAGGACAGTGAAACTTGATCTAGTTCAAGGTGGGCTTGAACCAGTTGAGGAAAAGAAA 

GAGAGAGGGACGATTCAAGAGGTTTTGGTGCAACAAATGGCTTCTTCGTTGACCAAAGAT 

CCTAAGTTCACTGCAGCTCTTGCGACTGCTATTTCCGGGAGATTGATAGAGCATTCAAGA 

ACATGAAAGTTCTCTAGAACATGTATATTTCTGTTTTGTTCTATTTTGTTGCTCATTCCT 

AGTAAAAAGGTAAAGATTTGTTTGATCTTGATTAGGAGGCATAGATGTCAATTTTAATGT 

GTGTGTATATAATTACATCAAATCTAAGTATCCAAAAAGGGTCACCCCCATTTTATCTTA 

TG 

>G986 Amino Acid Sequence (domain in AA coordinates: 146-203)
MDYDPNTNPFDLHFSGKLPKREVSASASKVVEKK^

EMLARVCEKYYALNNLMEELQSRKSPESVNFQNKQLTGKRKQELDEFVSSPIGLSLGPIE
NITNDKATVSTAYFAAEKSDTSLTVKDGYQWRKYGQKITRDNPSPRAYFRCSFSPSCLVK
KKVQRSAEDPSFLVATYEGTHNHTGPHASVSRTVKLDLVQGGLEPVEEKKERGTIQEVLV
QQMASSLTKDPKFTAALATAISGRLIEHSRT*
>G789 (259.. 1593) 

GGCAAGAAGAACCTTAGCCTCTCTTTCTTCTTTCTCTCTCTCTCTCTCTGTGGTACTGTT 
CTGTTTCAACTTTACTCCCTCAGTTTCAGAACAATTCCCTATCTAGAAGAGAGATAAAAC 
CGAGAAGGTTTTGGAGATAGAATCTTTTGTTCITCTTTTGTCCCTCCTTGCTCGATTTTT 
GTTACGTGTGAAGCAATAAAAAAAAACTGATATAGCTAAATCTTCCATCCATTCAGAGGC 
TTCTAAATCTGATCTGACATGGAACAAGTGTTTGCTGATTGGAATTTTGAAGATAATTTT 
CACATGTCCACTAATAAAAGATCAATCAGACCAGAAGATGAATTAGTGGAGCTATTGTGG 
AGAGATGGTCAAGTGGTTTTACAAAGCCAAGCTCGTAGAGAACCGTCAGTCCAAGTCCAA 
ACCCACAAACAAGAAACCCTAAGAAAACCCAACAATAT^ 

GTACAAAAGCCTAACTACGCTGCTCTAGATGATCAAGAAACCGTCTCCTGGATACAATAC 

CCTCCGGATGACGTCATCGACCCTTTCGAATCCGAGTTCTCCTCTCATTTCTTCTCTTCG 

ATCGATCACCTCGGAGGTCCTGAGAAGCCACGAACGATCGAAGAGACAGTTAAGCATGAG 

GCTCAAGCCATGGCTCCTCCTAAGTTTAGATCCTCGGTTATAACAGTCGGACCGAGTCAT 

TGCGGC^GCAACCAGTCAACAAATATTCATCAGGCCACTACACTTCCGGTTTCTATGAG^ 

GATAGAAGCAAGAACGTCGAAGAAAGACTTGACACTTCGTCAGGTGGCTCCTCCGGTTGC 

AGCTATGGAAGGAACAACAAAGAAACCGTTAGTGGAACAAGTGTAACCATTGACCGT 

AGAAAACATGTTATGGATGCTGATCAAGAATCTGTGTCTCAATCAGATATAGGTTTGACC 

TCAACCGATGATCAAACCATGGGTAACAAATCGAGCCAACGGTCAGGATCTACTCGAAGA 

AGCCGTGCAGCTGAAGTTCATAATCTCTCAGAAAGGAGGAGGAGAGATCGGATCAATGAA 

AGAATGAAAGCTCTTCAAGAACTCATACCTCACTGCAGCAGAACAGATAAAGCTTCGATA 

TTGGATGAAGCAATTGATTACTTAAAATCACTTCAAATGCAACTCCAAGTGATGTGGATG 

GGAAGTGGAATGGCGGCGGCGGCAGCAGCAGCAGCAAGTCCGATGATGTTTCCCGGGGTA 

CAATCATCTCCATACATTAATCAGATGGCTATGCAAAGTCAGATGCAATTGTCTCAATTC 

CCGGTTATGAACCGGTCCGCTCCGCAGAACCATCCCGGTTTAGTATGTCAAAACCCGGTA 

CAGTTGCAGCTCCAAGCACAGAACCAAATCTTATCGGAGCAGCTCGCTAGGTACATGGGC 

GGGATTCCCCAGATGCCGCCGGCGGGAAATCAGATGCAGACCGTGCAACAACAACCAGCG 

GACATGTTGGGATTTGGATCTCCGGCGGGACCGCAAAGTCAACTGTCGGCACCGGCGACC 

ACCGACAGTCTTCATATGGGTAAAATAGGCTGACTTGGCATATAGTTTTCCTCCGAAATT 

ATTCTTCTTACAGTTGGTGATTGTTATTTATTTTTGGTCGCCTAAGCAAGCATAAAAGCT 

AAGTCAAATGTATTATAGAGATCTAATAAGTTAGTCTCATACTTATAACTTATTTTTAAA 

CAGTTGAATTATAGTATCAATCAAGTGTTGGGAACCTAAAGATCATACATGTGTCAATAC 

TTTTATATTTGTTCTCAAGGTTCATCAGAAAAACAAAATAAAAAGGATAGACTAGGCCTG 
CATTTGACATTATCATGGGCTTTTTTGGGTCTATGAATATGAACATTAACCCC

>G789 Amino Acid Sequence (domain in AA coordinates: 253-313) 

MEQVFADWNFEDNFHMSTNKRSIRPEDELVELLWRDGQVVLQSQARREPSVQVQTHKQET 

LRKPNNIFLDNQETVQKPNYAALDDQETVSWIQYPPDDVIDPFESEFSSHFFSSIDHLGG

PEKPRTIEETVKHEAQAMAPPKFRSSVITVGPSHCGSNQSTNIHQATTLPVSMSDRSKNV 

EERLDTSSGGSSGCSYGRNNKETVSGTSVTIDRKRKHVMDADQESVSQSDIGLTSTDDQT 

MGNKSSQRSGSTRRSRAAEVHNLSERRRRDRINERMKALQELIPHCSRTDKASILDEAID 

YLKSLQMQLQVMWMGSGMAAAAAAAASPMMFPGVQSSPYINQMAMQSQMQLSQFPVMNRS

APQNHPGLVCQNPVQLQLQAQNQILSEQLARYMGGIPQMPPAGNQMQTVQQQPADMLGFG

SPAGPQSQLSAPATTDSLHMGKIG* 

>G2085 (1..930) 

ATGTTTGGTCGCCATTCGATTATCCCAAATAACCAGATTGGTACCGCCTCTGCTTCCGCT 
GGTGAAGACCATGTCTCTGCCTCCGCTACGTCTGGTCACATTCCTTACGACGATATGGAA 
GAAATCCCTCATCCTGACTCTATCTATGGTGCTGCCTCCGATTTGATTCCCGATGGCTCT 
CAATTGGTTGCTCACCGATCCGATGGCTCTGAATTACTTGTTTCTCGGCCACCGGAAGGG 
GCGAATCAGCTTACGATCTCGTTCCGTGGACAAGTTTACGTTTTTGATGCCGTTGGTGCT 
GACAAGGTGGATGCTGTGTTGTCGCTGTTGGGTGGTTCTACTGAGCTTGCTCCTGGTCCG 
CAGGTGATGGAACTAGCTCAACAGCAGAATCATATGCCTGTTGTAGAATATCAGAGCCGC 
TGTAGCCTTCCGCAACGGGCACAATCCTTGGATAGGTTTCGGAAGAAGAGGAATGCTAGA 
TGTTTCGAGAAGAAAGTAAGATACGGTGTTCGCCAAGAAGTTGCCTTAAGAATGGCACGT 
AATAAAGGTCAATTCACCTCTTCAAAGATGACAGATGGGGCTTATAACTCTGGCACAGAT 
CAAGATTCTGCCCAAGATGATGCCCATCCAGAAATATCGTGTACTCATTGCGGCATTAGT 
TCCAAATGTACACCAATGATGCGACGTGGCCCTTCCGGCCCCAGGACTCTCTGCAATGCC 
TGTGGACTTTTTTGGGCTAACAGGGGTACATTGAGGGATCTCTCAAAGAAAACAGAAGAG 
AATCAGTTGGCXTTAATGAAACCGGATGATGGTGGGAGTGTTGCTGATGCTGCTAACAAC 
TTAAACACTGAAGCTGCAAGTGTTGAAGAACACACTTCCATGGTTTCTCTTGCC^TGG^ 
GATAATTCTAATCTGTTAGGTGATCACTAA 

>G2085 Amino Acid Sequence (domain in AA coordinates: TBD)
MFGRHSIIPNNQIGTASASAGEDHVSASATSGHIPYDDMEEIPHPDSIYGAASDLIPDGS
QLVAHRSDGSELLVSRPPEGANQLTISFRGQVYVFDAVGADKVDAVLSLLGGSTELAPGP 
QVMELAQQQNHMPVVEYQSRCSLPQRAQSLDRFRKKRNARCFEKKVRYGVRQEVALRMAR
NKGQFTSSKMTDGAYNSGTDQDSAQDDAHPEISCTHCGISSKCTPMMRRGPSGPRTLCNA
CGLFWANRGTLRDLSKKTEENQLALMKPDDGGSVADAANNLNTEAASVEEHTSMV 

DNSNLLGDH* 
>G1783 (1..603) 

ATGGCCGCGTTTCCGCAGTGGACAAGGGTCGATGACAAACGTTTTGAGTTAGCTCTGCTT 
CAAATCCCGGAGGGTTCGCCGAATTTTATAGAGAATATCGCCTATTATCTCCAGAAACCG 
GTGAAGGAGGTGGAGTACTACTACTGCGCGTTGGTCCATGATATTGAGCGGATCGAATCG 
GGTAAGTATGTTTTGCCCAAATACCCGGAAGACGATTACGTGAAACTGACGGAAGCAGGT 
GAGTCTAAGGGCAATGGGAAAAAGACGGGAATTCCTTGGTCAGAAGAGGAACAGAGGTTG 
TTTCTGGAAGGACTAAATAAGTTTGGGAAAGGAGACTGGAAGAACATATCGAGGTATTGT 
GTGAAGTCAAGGACCTCGACGCAAGTGGCAAGCC^TGCTCAGAAGTATTTTGCAAGGCAA 
AAGCAGGAGAGTACGAATACTAAACGCCCGAGTATTCATGACATGACTCTGGGAGTTGCG 
GTCAATGTCCCTGGATCCAACTTGGAGTCTACTGGCCAGCAACCACATTTTGGTGATCAA 
ATTCCTTCGAATCAATATTATCCCTCCCAGGAAAACTTTCGGGGTTTTGATCAGCGATGG 

TGA 

>G1783 Amino Acid Sequence (domain in AA coordinates: 81.. 129) 
MAAFPQWTRVDDKRFELALLQIPEGSPNFIENIAYYLQKPVK^

GKYVLPKYPEDDYVKLTEAGESKGNGKKTGIPWSEEEQRLFLEGLNKFGKGDWKNISRYC 
VKSRTSTQVASHAQKYFARQKQESTNTKRPSIHDMTLGVAVNVPGSNLESTGQQPHFGDQ 
IPSNQYYPSQENFRGFDQRW*
>G2072 (155.. 793) 

TCGACCCACGCGTCCGCCCACGCGTCCGGATCTTTTCACAGAAGACCAACCAGCTTGGCT 
CGATGAGCTCCTAAGTGAGCCAGCATCACCTAAGATTAACAAAGGTCATAGACGTTCAGC 
TAGTGACACAGCTGCTTACTTGAACTCAGCTTTAATGCCTTCGAAGGAAAATCATGTTGC 
TGGTTCGTCTTGGCAGTTCCAGAACTATGATTTGTGGCAGTCCAACTCTTATGAACAACA 
CAATAAATTAGGATGGGATTTCTCTACAGCAAATGGAACTAATATCCAAAGAAATATGTC 
ATGCGGAGCTTTAAATATGTCGTCGAAACCCATTGAGAAACATGTAAGCAAAATGAAAGA

AGGAACTTCTACAAAACCAGATGGTCCTAGATCAAAGACTGACTCAAAACGTATCAAACA 

TCAAAATGCTCATCGAGCGCGTTTGAGAAGGCTTGAGTACATATCAGACCTTGAAAGGAC 

CATCCAAGTGCTACAAGTTGAAGGATGTGAAATGTCATCTGCCATTCACTACTTGGATCA 

GCAGTTACTCATGCTTAGCATGGAAAATAGAGCTTTAAAACAACGTATGGATAGTTTAGC 

AGAAATCCAAAAGCTTAAACATGTGGAGCAGCAATTGCTTGAGAGAGAGATAGGAAACCT 

ACAGTTTCGACGACACCAACAACAACCACAGCAAAACCAAAAACAAGTCCAAGCAATACA 

AAATCGATACACCAAATATCAACGACCTGTTACACAAGAACCCGATGCCCAATTTGCAGC 

CTTGGCAATATGATTTAGGAAATATGGATACATTGTTCAGATTAAGCTGAGCTCCTCTTG 

CTCTACCTTAATGTCCATACAACATAGGTGAACTTGATGTTTGTAGCCTTGAATGAAAAC 

CTAAAA7VAGGATCGTTATGTAAATCAAAATGTGGTTGCCCATATCCTCCTCTATTGCATT 

TCTCTCTATTTATGGCATGGTAGAGAATCTCTTGTCAAGAAACTTCATGTTATGTAATAA 

CTTGTAATCCTTCTTATTTCATCTATTATATATATGAATAAGTAATTTTTTTGC 

AAAAAAAAAAAAAAAAAAA 

>G2072 Amino Acid Sequence (conserved domain in AA coordinates: 90-149)

MPSKENHVAGSSWQFQNYDLWQSNSYEQHNKLGWDFSTANGTNIQRNMSCGALNMSSKPI 

EKHVSKMKEGTSTKPDGPRSKTDSKRIKHQN

SSAIHYLDQQLLMLSMENRALKQRMDSLAEIQKLKHV^

NQKQVQAIQNRYTKYQPPVTQEPDAQFAALAI*

>G931 (85.. 1071) 

GGAGGTTCTTTGACAGACACATGTATCATCAATCTTCTCTGTTGAAGCAGAGAGAGAGAG 
AGCTAATTGTTGCCTCTGAGTCACATGGATAAGAAAGTTTCATTTACTAGCTCTGTGGCA 
CATTCAACTCCACCATACCT^ 

GGTGTGACTGAATCACTGAGTTTGAAGGTGGTAGATGCAAGACCAGAACGTCTTATAAAC 
A(^AAGAATATCAGTTTC(^GGACCAGGATTGATCTT(^ACTCTGTCCTCTGCTCAATCT 
TCTAACGATGTTACAAGTAGTGGAGATGATAACCCCTCAAGAGAAATCTCATTTTTAGCA 
CATTCAGATGTTTGTAAAGGATTTGAAGAAACTCAAAC^ 

GGCTCCTCCACGGCAGGAATCGCTGATATTCACTCTTCTCCTTCCAAGGCTAACTTCTCA 
TTTCACTATGCCGATCCA(^TTTTGGTGG 

ACAATATGGAATCCCCAAATGACTCGAGTTCCGCTACCATTCGATCTCATAGAGAATGAG 
CCTGTCTTTGTCAATGCAAAGCAATTCCATGCAATTATGAGGAGGAGGCAACAGCGTGCT 
AAGCTAGAGGCGCAAAACAAACTAATCAAAGCCCGTAAGCCGTATCTTCATGAATCTCGA 
CATGTTCACGCTCTTAAA.CGACCTAGAGGATCTGGTGGAAGATTCCTAAACACCAAAAAG 
CTTCAAGAATCTACAGATCCAAAACAAGACATGCCAATCCAACAGCAACACGCAACGGGA 
AACATGTCAAGATTTGTGCTTTATCAGTTGCAGAACAGCAATGACTGTGATTGTTCAACC 
ACTTCTCGCTCTGACATCACATCTGCTTCTGAC^GCGTTAATCTCTTTGGACACTCTGAA 
TTTCTGATAT(^GATTGCCCATCTCAGACAAACCCAACAATGTATGTTCATGGTCAATCA 
AATGACATGCATGGAGGTAGGAACACACACCATTTCTCTGTCCATATCTGAGCCGGTGGA 
ATCTGGTAATGTGTACGTTCCTACAAAAAAAGGGAAGTCATCCTTGGCTGCTACTTCGCT 
TATTAGCTAGTTCTTATTTCACACGCTTTGTCCAGATATC 

>G931 Amino Acid Sequence (domain in AA coordinates: TBD)

MDKKVSFTSSVAHSTPPYLSTSISWGLPTKSNGVTESLSLKNAHDARPERLINTKNISFQ

QDSSSTLSSAQSSNDVTSSGDDNPSRQISFLAHSDVCKGFEETQRKRFAIKSGSSTAGIA 

DIHSSPSKANFSFHYADPHFGGLMPAAYLPQATIWNPQMTRVPLPFDLIENEPVFVNAKQ 

FHAIMRRRQQRAKLEAQNKLIKARKPYLHESRHVHAL^ 

QDMPIQQQHATGNMSRFVLYQLQNSNDCDCSTTSRSDITSASDSVNLFGHSEFLISDCPS
QTNPTMYVHGQSNDMHGGRNTHHFSVHI*
>G278 (93.. 187*) 

TCGATCTTTAACCAAATCCAGTTGATAAGGTCTCTTCGTTGATTAGCAGAGATCTCTTTA 
ATTTGTGAATTTCAATTCATCGGAACCTGTTGATGGACACCACCATTGATGGATTCGCCG 
ATTCTTATGAAATCAGCAGCACTAGTTTCGTCGCTACCGATAACACCGACTCCTCTATTG 
TTTATCTGGCCGCCGAACAAGTACTCACCGGACCTGATGTATCTGCTCTGCAATTGCTCT 
CCAACAGCTTCGAATCCGTCTTTGACTCGCCGGATGATTTCTACAGCGACGCTAAGCTTG 
TTCTCTCCGACGGCCGGGAAGTTTCTTTCCACCGGTGCGTTTTGTCAGCGAGAAGCTCTT 
TCTTCAAGAGCGCTTTAGCCGCCGCTAAGAAGGAGAAAGACTCCAACAACACCGCCGCCG 
TGAAGCTCGAGCTTAAGGAGATTGCCAAGGATTACGAAGTCGGTTTCGATTCGGTTGTGA 
CTGTTTTGGCTTATGTTTACAGCAGCAGAGTGAGACCGCCGCCTAAAGGAGTTTCTGAAT 
GCGCAGACGAGAATTGCTGCCACGTGGCTTGCCGGCCGGCGGTGGATTTCATGTTGGAGG
TTCTCTATTTGGCTTTCATCTTCAAGATCCCTGAATTAATTACTCTCTATCAGAGGCACT 
TATTGGACGTTGTAGACAAAGTTGTTATAGAGGACACATTGGTTATACTCAAGCTTGCTA 
ATATATGTGGTAAAGCTTGTATGAAGCTATTGGATAGATGTAAAGAGATTATTGTCAAGT 
CTAATGTAGATATGGTTAGTCTTGAAAAGTCATTGCCGGAAGAGCTTGTTAAAGAGATAA 
TTGATAGACGTAAAGAGCTTGGTTTGGAGGTACCTAAAGTAAAGAAACATGTCTCGAATG 
TACATAAGGCACTTGACTCGGATGATATTGAGTTAGTCAAGTTGCTTTTGAAAGAGGATC 
ACACCAATCTAGATGATGCGTGTGCTCTTCATTTCGCTGTTGCATATTGCAATGTGAAGA 
CCGC^CAGATCTTTTAAAACTTGATCTTGCCGATGTCAACCATAGGAATCCGAGGGGAT 
ATACGGTGCTTCATGTTGCTGCGATGCGGAAGGAGCCACAATTGATACTATCTCTATTGG 
AAAAAGGTGCAAGTGCATCAGAAGCAACTTTGGAAGGTAGAACCGCACTCATGATCGCAA 
AACAAGCCACTATGGCGGTTGAATGTAATAATATCCCGGAGCAATGCAAGCATTCTCTCA 
AAGGCCGACTATGTGTAGAAATACTAGAGCAAGAAGACAAACGAGAACAAATTCCTAGAG 
ATGTTCCTCCCTCTTTTGCAGTGGCGGCCGATGAATTGAAGATGACGCTGCTCGATCTTG 
AAAATAGAGTTGCACTTGCTCAACGTCTTTTTCCAACGGAAGCACAAGCTGCAATGGAGA 
TCGCCGAAATGAAGGGAACATGTGAGTTCATAGTGACTAGCCTCGAGCCTGACCGTCTCA 
CTGGTACGAAGAGAACAT(^CCGGGTGTAAAGATAG(^CCTTTCAGAATCCTAGAAGAGC 
ATCAAAGTAGACTAAAAGCGCTTTCTAAAACCGTGGAACTCGGGAAACGATTCTTCCCGC 
GCTGTTCGGCAGTGCTCGACCAGATTATGAACTGTGAGGACTTGACTCAACTGGCTTGCG 
GAGAAGACGACACTGCTGAGAAACGACTACAAAAGAAGCAAAGGTACATGGAAATACAAG 
AGACACTAAAGAAGGCCTTTAGTGAGGACAATTTGGAATTAGGAAATTCGTCCCTGACAG 
ATTCGACTTCTTCCACATCGAAATCAACCGGTGGAAAGAGGTCTAACCGTAAACTCTCTC 
ATCGTCGTCGGTGAGACTCTTGCCTCTTAGTGTAATTTTTGCTGTACCATATAATTCTGT 
TTTCATGATGACTGTAACTGTTTATGTCTATCGTTGGCGTCATATAGTTTCGCTCTTCGT 
TTTGCATCCTGTGTATTATTGCTGCAGGTGTGCTTCAAACAAATGTTGTAACAATTTGAA 
CCAATGGTATACAGATTTGTAATATATATTTATGTACATCAAC^TAAAAAAAAAAAAAA 
AAAA 

>G278 Amino Acid Sequence (domain in AA coordinates: 2-593) 

MDTTIDGFADSYEISSTSFVATDNTDSSIVYLAAEQVLTGPDVSALQLLSNSFESVFDSP 

DDFYSDAKLVLSDGREVSFHRCVLSARSSFFKSALAA

YEVGFDSVVTVLAYVYSSRVRPPPKGVSECADENCCHVACRPAVDFMLEVLYLAFIFKIP
ELITLYQRHLLJDVVDKWIEDTLVIL 

LPEELVKEIIDRRKELGLETVPKVKKHVSNVHKALDSDDIEL 
FAVAYCNVKTATDLLKXI)LADVNHRNPR 

EGRTALMIAKQATMAVECNNIPEQCKHSLKGRLCVEILEQEDKREQIPRDVPPSFAVAAD
ELKMTLLDLENRVALAQRLFPTEAQAAMEIAEMKGTCEFIVTSLEPDRLTGTKRTSPGVK
IAPFRILEEHQSRLKALSKTVELGKRFFPRCSAVLDQIMNCEDLTQLACGEDDTAEKRLQ 
KKQRYMEIQETLKKAFSEDNLELGNSSLTDSTSSTSKSTGGKRSNRKLSHRRR*
>G2421 (1..630) 

ATGGAGGGTTCGTCCAAAGGGTTGAGGAAAGGTGCATGGACTGCTGAAGAAGATAGTCTC 
TTGAGGCAGTGTATTGGTAAGTATGGAGAAGGCAAATGGCATCAAGTTCCTTTAAGAGCT 
GGGCTAAATCGGTGCAGGAAAAGTTGTAGACTAAGATGGTTAAACTATTTGAAGCCAAGT 
ATCAAGAGAGGAAAATTTAGTTCTGATGAAGTTGATCTTCTTCTTCGTCTTCATAAGCTT 
CTAGGAAATAGGTGGTCCTTGATTGCTGGTCGATTACCTGGTCGGACCGCTAATGATGTC 
AAGAACTACTGGAACACCCATCTGAGTAAGAAGCATGAACCGTGTTGTAAAACTAAGATA 
AAAAGGATAAATATTATAACCCCTCCTAATACACCGGCCCAAAAAGTTTGTGAAAATAGT 
ATCACATGTAACAAAGATGATGAGAAAGATGATTTTGTGGATAATTTTATGGTTGGAGAT 
AATATATGGTTGGAGCGTTTGCTAGACGAGGGCCAAGAGGTAGATGTGCTGGTTACAGAA 
GCGGCGGCAACAGAAAAGGAGGGCACTTTGGCGTTTGACGTTGAGCAACTTTGGAATTTG 
TTCGATGGAGAGACTGTGATCTTTGATTAGTGTTTATAAACGTTTGTGTTCTCTTGTTTG 
TGAGGTTTCTCTATTTAATTTAGTATCTATTTTCTAAATTAACTAATATCTTATAGTATT 
TTAGGCAAACCTTATGTTTCCGTTTCTGTGCGGCCGCTCTAG 

>G2421 Amino Acid Sequence (domain in AA coordinates: 9-110) 
MEGSSKGLRKGAWTAEEDSLLRQCIGKYGEGKWHQVPLRAGLNRCRKSCRLRWLNYLKPS
IKRGKFSSDEVDLLLRLHKLLGNRWSLIAGRLPGRTANDV^

KRINIITPPNTPAQKVCENSITCNKDDEKDDFVDNFMVGDNIWLERLLDEGQEVDVLVTE
AAATEKEGTLAFDVEQLWNLFDGETVIFD* 
>G2032 (53.. 1789)

TCCCTCCCAGAGTAAGAACTTCCATACTTTGCTCTAGATTTCTTGAGAAAAGATGCAGCC 

GATCTTCCATGCGATCCTTAAAAATGACCTTCCAGCTTTTTTAGAGTTGGTAGAAG 

TGAATCGTCTCTGGAGGAGAGAAACGAGGAAGAACACTTGAACAACACGGTTTTGCACAT 

GGCTGCAAAGTTTGGTCACCGAGAACTCGTCTCCAAGATTATTGAGCTCCGACCTTCCCT 

CGTGTCTTCCCGCAACGCATACAGAAACACACCTTTGCATCTTGCTGCTATCCTTGGAGA 

TGTAAACATAGTTATGCAGATGTTAGAGACTGGATTGGAAGTGTGTTCTGCACGCAATAT 

CAACAACCAC^CACCACTCCACTTGGCTTGCCGTAGCAATTCCATAGAGGCTGCCAGACT 

CATCGCGGAA/U^GACACAATCAATTGGCCTCGGTGAACTCATTCTCGCCATATCAAGTGG 

ATCCACTAGTATCGTAGGGACTATACTGGAGAGATTCCCAGACCTAGCTAGGGAAGAAGC 

TTGGGTGGTTGAAGACGGCTCACAATCAACGCTACTGCATCATGCGTGTGATAAGGGAGA 

CTTTGAACrGACAACTATATTGTTAGGGCTCGATCAAGGATTAGAAGAAGCACTTAACCC 

GAATGGTTTATCACCTCTGCATCTTGCGGTCCTCAGAGGCTCGGTTGTGATCCTGGAGGA 

GTTCTTGGACAAGGTTCCATTGTCTTTCAGCTCAATCACGCCGTCGAAAGAGACAGTCTT 

TCATCTCGCTGCTCGATU^CAAAAATATGGATGCCTTTGTTTTTATGGCAGAGAGTTTGGG 

AATTAACAGCCAAATTCTTCTACAGCAAACCGAT 

TGCTGCATCCGTCTCTTTTGATGCTCCTCTTATACGTTACATTGTTGGTAAGAATATAGT 

AGATATCACGTCCAAGAACAAGATGGGTTTTGAAGCTTTTCAACTTC 

CCAAGACTTTGAGTTGTTATCAAGGTGGCTGAGATTTGGTACCGAGACTTCACAAGAGCT 

GGATTCTGAGAACAATGTAGAACAACACGAAGGCTCTCAAGAGGTCGAGGTAATACGGTT 

GCTAAGGATTATAGGAATAAACACATCAGAGATAGCAGAGAGAAAGAGAAGCAAGGAACA 

GGAAGTGGAAAGAGGTCGTCAGAACTTGGAATATCAGATGCATATAGAAGCATTACAGAA 

TGCAAGAAATACGATTGCTATAGTGGCAGTCTTGATTGCTTCAGTTGCTTATGCCGGTGG 

GATAAACCCTCCGGGGGGCGTCTACCAAGACGGGCCATGGAGAGGGAAATCCTTAGTGGG 

GAAAACAACGGCGTTTAAGGTCTTTGCGATATGCAACAACATCGCACTGTTCACGTCCTT 

GGGCATCGTTATTCTTCTTGTTAGCATCATACCTTACAAGAGGAAACCCTTAAAGAGATT 

ATTGGTGGCCACGGATAGGATGATGTGGGTTTCTGTAGGTTTCATGGCGACGGCTTATAT 

AGCGGCGTCTTGGGTGACCATACCGCATTATCATGGAACACAATGGTTATTTCCAGCAAT 

TGTAGCCGTTGCTGGTGGAGCGTTGACCGTACTCTTTTTCTATCTCGGAGTTGAGACCAT 

CGGTCATTGGTTTAAGAAGATGAATCGTGTAGGGGATAATATACCTTCCTTTGCAAGAAC 

CAGTTCAGATTTAGCCGTCTCCGGAAAATCAGGCTATTTCACCTATTAAGAAAAACTGGT 

TTTCTAATTTCCCTGTAACCTGTGTAATTGTGTATGTG 

>G2032 Amino Acid Sequence (domain in AA coordinates: entire protein) 
MQPIFHAILKNDLPAFLELVT3DSESSLEERNEE 

PSLVSSRNAYRNTPLHLAAILGDVNIVMQMLETGLEVCSARNINNHTPLHLACRSNSIEA
ARLIAEKTQSIGLGELIIAISSGSTSIVGTILERFPDLAREEAWVVEDGSQSTLLHHACD 
KGDFELTTILLGLDQGLEEALNPNGLSPLHLAVLRGSVVILEEFLDKVPLSFSSITPSKE
TVFHLAARNKNMDAFVFMAESLGINSQILLQQTDESGNTVLHIAASVSFDAPLIRYIVGK 
NIVDITSKEnOVIGFEAFQLLPREAQDFELLSRWLRFGT^^^ 

IRLLRIIGINTSEIAERKRSKEQEVERGRQNLEYQMHIEALQNARNTIAIVAVLIASVAY 
AGGINPPGGVYQDGPWRGKSLVGKTTAFKVFAICNNIALFTSLGIVILLVSIIPYKRKPL 
KRLLVATHRMMWVSVGFMATAYIAASWVTIPHYHGTQWLFPAIVAVAGGALTVLFFYLGV
ETIGHWFKKMNRVGDNIPSFARTSSDLAVSGKSGYFTY* 
>G1396 (83.. 313) 

TCGACCTCGTTTCCTTTCCTCCTCTCTTCCTACCATTAGTACGTTACTGGAGCTGATCTC 
ACGTATATTTTGGATCGTAATCATGGACGGCGAAGATTTTGCCGGAAAGGCGGCTGCTGA 
AGCCAAGGGATTGAACCCGGGATTAATCGTGCTGCTTGTTGTTGGAGGTCCGCTTCTTGT 
GTTCCTAATCGCCAACTACGTGCTTTACGTTTATGCTCAGAAGAACCTACCTCCAAGGAA 
GAAGAAGCCCGTTTCCAAAAAGAAGCTCAAGCGGGAGAAGCTAAAGCAAGGAGTCCCTGT 
CCCTGGAGAATAAAAGCCAGCTTAAGCTTCCTTCACTTGTGCCTCCTTCAAAGCGGTTTT 
TGTTCGGTTACCAAATTTCACCCTTGCGGGTTTTTTTCTTCCTTTACTTCTGTCATGAGG 
ATTATCTTTGAGGCCT 

>G1396 Amino Acid Sequence (domain in AA coordinates: TBD) 
MDGEDFAGKAAAEAKGLNPGLIVLLVVGGPLLVFLIANYVLYVYAQKNLPPRKKKPVSKK
KLKREKLKQGVPVPGE* 
>G619 (382.. 2748) 

ATTTTTTTCCAATCTGCAAATTTTAGTCTATGTCTGTTCCTTGTGCTCCCTCTTCTCAGT 
ACCTGCAAATGGAGGAAGAAGAATCCTTCTCTGAAACCCTTGTTCTCATTTGATTCCTCC
TTCCTCTCTCTTCTTCTCTCTCTGTCTCTGATTCGTTATTCCACACTTATGACTCATCTT 
TCCCGTCAATAGCTAAGTTTGCCTCTTCTTTGTGAAATTTAGCTGAAAAAGGAGAGGAAT 
TCCGAATTCTGTCACTTCAAAGCTCGAATTTTGCAAACT 

GTTTTGTTGTAATCTGATTAAAAATAGAAACTTTTTGTTTTCTTCTTGTCTCCTTTTGCT 

CTTAAAAGAGAAGCTTTTTCAATGGAATTTGACTTGAATACTGAGATTGCGGAGGTGGAA 

GAGGAGGAGAATGATGATGTAGGAGTAGGAGTAGGAGGAGGAACAAGAATTGACAAGGGT 

AGGCTTGGAATTTCACCATCTTCTTCTTCTTCATGCTCTTCCGGATCATCATCGTCATCA 

TCTTCTACAGGCTCTGCATCTTCCATTTACTCTGAGCTTTGGCATGCTTGTGCTGGTCCT 

CT(^CTTGTCTTCCCAAGAAAGGCAATGTAGTTGTCTATTTCCCTCAAGGTCATTTGGAG 

CAAGATGCTATGGTTTCATATTCGTCTCCTCTTGAAATCCCCAAATTTGACCTTAATCCC 

CAAATCGTCTGCAGGGTGGTTAATGTCCAGTTGCTTGCTAATAAGGACACCGATGAGGTC 

TACACTCAAGTCACTCTGCTTCCACTTCAAGAGTTTTCGATGCTAAATGGGGAGGGGAAA 

GAGGTCAAGGAGTTAGGAGGGGAGGAAGAGAGGAACGGAAGCTCATCCGTCAAGCGGACA 

CCTCATATGTTCTGTAAAACCTTAACAGCGTCTGAGACAAGCACACATGGAGGCTTCTCT 

GTACCTAGAAGAGCCGCTGAAGATTGTTTTGCTCCTCTTGACTACAAACAACAGAGGCCA 

TCTCAAGAGCTCATTGCAAAGGACCTCCATGGAGTAGAGTGGAAGTTTCGCCATATCTAT 

AGAGGTCAACC^GGAGGCATCTACTCACCACTGGTTGGAGTATCTTTGTCAGTCAAAAG 

AATCTCGTCTCTGGTGATGCGGTTCTCTTTCTGAGAGACGAAGGAGGAGAGCTGAGATTA 

GGAATCAGAAGAGCAGCACGGCCAAGAAATGGACTTCCTGACTCAATCATTGAGAAGAAT 

TCATGTTCAAACATTCTGTCTCTTGTGGCTAATGCTGTATCTACAAAAAGCATGTTTCAT 

GTGTTCTACAGTCCACGAGCGACGCATGCAGAGTTTGTGATTCCTTATGAGAAGTATATC 

ACAAGCATCAGGAGTCCTGTTTGCATAGGCACAAGATTTAGAATGCGATTTGAAATGGAC 

GATTCTCCTGAGAGAAGATGCGCTGGTGTAGTGACTGGAGTCTGTGACTTGGACCCGTAT 

AGGTGGCC^yUVCTCTAAATGGAGGTGCTTGTTGGTGCGATGGGATGAGTCTTTTGTGAGT 

GAT<^TCAAGAAAGAGTTTC^CCTTGGGAGATTGATCCCTCGGTTTCTCTCC(^CACTTG 

AGCATTCAGTCATCTCCAAGGCCTAAAAGGCCATGGGCAGGTTTACTGGATACTACCCCA 

CCCGGAAACCCCATAACAAAAAGGGGTGGTTTTTTGGACTTTGAGGAGTCGGTTAGACCC 

TCTAAGGTCTTGCAAGGT(^^GAAAATATAGGTTCTGCATCACCCTC^CAGGGGTTTGAT 

GTTATGAACCGCCGGATACTGGATTTTGCGATGCAGTCTCATGCAAATCCAGTCCTTGTG 

TCGAGTAGAGTCAAGGATCGATTTGGTGAGTTTGTAGATGCTACTGGCGTGAACCCAGCT 

TGTTCAGGTGTTATGGACCTGGATAGGTTTCCAAGGGTCTTGCAAGGTCAAGAAATTTGC 

TCGCTTAAATCATTCCCGCAATTTGCTGGTTTCAGTCCAGCTGCTGCTCCTAATCCCTTT 

GCTTACCAAGCCAACAAGTCAAGTTACTATCCGCTAGCTTTGCATGGGATTAGGAGCACT 

CATGTTCCGTATCAGAATCCATACAATGCGGGAAACCAATCCTCGGGTCCCCCTTCACGT 

GCAATAAACTTTGGTGAAGAGACTAGAAAGTTTGATGCACAAAATGT^AGGTGGCCTACCA 

AATAATGTTACAGCTGATTTGCCATTC1AAGATTGATATGATGGGAAAACAGAAAGGCAGT 

GAGTTGAATATGAATGCTTCATCAGGATGTAAACTTTTCGGATTCTCCTTACCAGTGGAG 

ACACCTGGATCTAAGCCGCAAAGCTCGAGCAAAAGAATCTGT^ 

GGAAGCC^^GTGGGGAGAGCTATTGATTTGTCGCGACTTAACGGGTATGATG 

ATGGAGCTTGAACGGCTGTTCAACATGGAAGGGCTTCTCAGGGATCCTGAAAAAGGATGG 

AGGATCTTATATACTGATAGTGAGAACGATATGATGGTCGTTGGCGATGATCCATGGCAT 

GATTTCTGCAATGTGGTGTGGAAGATACACTTATACACGAAAGAGGAAGTGGAGAATGCG 

AATGACGATAACAAGAGTTGTTTAGAGCAAGCTGCTCTCATGATGGAAGCATCAAAGTCA 

TCTTCTGTGAGCCAGCCTGATTCTTCTCCTACAATCACTAGGGTTTGATACCCATAAAGA 

AGCTTATTTCCTATGTTTTAAAGTGTGTTTTGCTCAC^AAAGAACTTCAACTTTATCTTT 

GTCTTTGAATCCATTTATGTGTTTGTTTGTGTTTCTTCTGGTCTCCATGGATGTCTCATG 

TGTACCGTTTTACTeGAGAGATATGTGAGTTTATGGGATGTGTAAAGCATGCCATTGGAT 

TTTAAGGTTTTCAAAATTACAATATATATATTAGTTTTGAAGTTAAAAAAA7VAAAAA 

A 

>G619 Amino Acid Sequence (domain in AA coordinates: 64-406) 

MEFDLNTEIAEVEEEENDDVGVGVGGGTRIDKGRLGISPSSSSSCSSGSSSSSSSTGSAS 

SIYSELWHACAGPLTCLPKKGNVVVYFPQGHLEQDAMVSYSSPLEIPKFDLNPQIVCRVV

NVQLLANKDTDEVYTQVTLLPLQEFSMLNGEGKEVKELGGEEERNGS

LTASDTSTHGGFSVPRRAAEDCFAPLDYKQQRPSQELIAKDLHGVEWKFRHIYRGQPRRH 

LLTTGWSIFVSQKNLVSGDAVLFLRDEGGELRLGIRRAARPRNGLPDSIIEKNSCSNILS

LVANAVSTKSMFHVFYSPRATHAEFVIPYEKYITSIRSPVCIGTRFRMRFEMDDSPERRC 
AGVVTGVCDLDPYRWPNSKWRCLLVRWDESFVSDHQERVSPWEIDPSVSLPHLSIQSSPR
PKRPWAGLLDTTPPGNPITKRGGFLDFEESVRPSKVLQGQENIGSASPSQGFDVMNRRIL
DFAMQSHANPVLVSSRVKDRFGEFVDATGVNPACSGVMDLDRFPRVLQGQEICSLKSFPQ 
FAGFSPAAAPNPFAYQANKSSYYPLALHGIRSTHVPYQNPYNAGNQSSGPPSRAINFGEE 
TRKFDAQNEGGLPNNVTADLPFKIDMMGKQKGSELNMNASSGCKL 

SSSKRICTKVHKQGSQVGRAIDLSRLNGYDDLLMELERLFNMEGLLRDPEKGWRILYTDS

ENDMMVVGDDPWHDFCNVVWKIHLYTKEE

SSPTITRV* 

>G2295 (33.. 917) 

GTAATATATAACAATAACTCAGGTTACAAAGGATGGTTCCGAAAGTGGTCGACCTACAAA 

GGATAGCGAACGATAAGACAAGGATAACAACTTACAAGAAGAGGAAAGCTAGTCTTTACA 

AGAAGGCACAAGAGTTCTCAACTCTCTGCGGCGTCGAGACATGTCTCATCGTCTACGGTC 

CCACGAAGGCTACCGATGTGGTGATTTCCGAGCCAGAGATATGGCCGAAGGACGAGACCA 

AAGTCAGGGCCATCATACGCAAGTACAAAGACACAGTGTCGACCAGCTGCAGGAAAGAAA 

CCAACGTGGAGACTTTCGTCAACGATGTAGGGAAAGGAAACGAGGTGGTGACTAAAAAGA 

GAGTGAAGCGTGAGAATAAGTATTCTAGTTGGGAGGAGAAGCTAGACAAGTGTTCACGAG 

AGCAACTACATGGGATTTTCTGTGCCGTGGATAGCAAGTTAAATGAAGCTGTAACGAGAC 

AGGAGCGTAGTATGTTTAGGGTTAATCATCAAGCCATGGACACACCATTCCCGCAGAATT 

TAATGGACGAACAATTCATGCCACAGTATTTTCATGAGCAGCCACAG 

CTAATAATTTCAATAATATGGGTTTCTCGTTGAT^ 

TGGACCCAAATCTCATGGAGAAGTGGACCGACTTGGCTTTGACTCAAAGCTTGATGATGT 
CAAAGGGAAACGATGGTACTCAATTCATGCAGAGGCAAGAACAACCATACTATAATCGTG 
AACAGGTTGTATCGAGGTCTGCAGGTTTCAATGTTAACCCGTTTATGGGATATCAAGTCC 
CGTTTAATATTCCTAATTGGAGATTATCGGGAAATCAAGTTGAAAATTlSGGAGCTTTCAG 
GGAAGAAAACGATATGATTTGAATTACGGAGCTTTATTAGTTTTTAGGGTTTTATAGTTO 
TG 

>G2295 Amino Acid Sequence (domain in AA coordinates: TBD) 

MVPKVVDLQRIANDKTRITTYKKRKASLYKKAQEFSTLCGVETCLIVYGPTKATD

PEIWPKDETKVRAIIRKYKDTVSTSCRKETNVETFVNDVGKGNEVVTKKRVKRENKYSSW

EEKLDKCSREQLHGIFCAVDSKLNEAVTRQERSMFRViraQAm 

HEQPQFQGFPNimn^GFSLISPHDGQIQ^PNLMEKWTDIj^ 

RQEQPYYNREQVVSRSAGFNVNPFMGYQVPFNIPNWRLSGNQVENWELSGKKTI*

>G312 (1..1755) 

ATGGCTTACATGTGCACTGATAGTGGCAATCTAATGGCTATTGCTCAACAAGTCATCAAA 
CAGAAGCAGCAACAAGAACAACAACAGCAGCAACATCATCAAGACCATCAGATTTTTGGT
ATTAATCCTTTGTCTCTTAACCCATGGCCCAATACTTCCCTCGGGTTTGGGCTTTCAGGT 
TCGGCTTTTCCCGACCCGTTTCAAGTTACCGGCGGCGGAGATTCCAACGATCCTGGCTTT 
CCTTTTCCTAACTTAGACC^CCACCACGCCAC^CCACCGGCGGTGGGTTCAGGTTATCT 
GATTTCGGCGGTGGAACCGGCGGCGGCGAGTTTGAGTCCGACGAGTGGATGGAGACTCTT 
ATCAGCGGTGGAGACTCCGTTGCAGACGGTCCTGATTGTGACACCTGGCATGATAATCCC 
GATTACGTAATCTACGGTCCTGATCCATTCGATACTTACCCGAGTCGACTCAGTGTCCAA 
CCGTCAGATCTAAACCGAGTCATTGACACGTCGAGTCCGCTTCCTCCGCCGACCTTGTGG 
CCTCCTTCTTCGCCATTATCGATTCCTCCGCTTACTCATGAGTCACCAACCAAAGAAGAT 
CCAGAGACTAACGACTCCGAAGACGATGACTTCGACCTAGAACCACCTCTCCTCAAAGCT 
ATATACGACTGTGCACGGATCTCAGACTCTGACCCTAACGAAGCTTCCAAGACGCTTCTT 
CAGATCCGAGAATCTGTATCGGAGCTAGGTGATCCGACGGAGCGAGTTGCATTTTACTTC 
ACGGAAGCTCTCTCCAACAGACTGTCTCCTAATTCGCCGGCGACGTCGTCTTCTTCTTCA 
TCTACGGAGGATTTAATCTTATCTTATAAAACCCTAAACGACGCTTGTCCTTACTCCAAA 
TTCGCACATTTGACGGCGAATCAAGCGATTCTAGAAGCGACGGAGAAGTCGAACAAGATT 
CACATCGTCGATTTTGGAATCGTTCAAGGTATACAATGGCCTGCTCTTCTTCAAGCTCTA 
GCTACTCGTACTTCTGGTAAACCCACTCAAATCCGGGTCTCGGGTATACCCGCTCCATCT 
CTCGGTGAATCTCCGGAACCGTCGTTAATCGCCACCGGAAACCGCCTCCGTGATTTCGCC 
AAGGTTCTGGATCTGAATTTCGATTTCATCCCAATTCTCACTCCCATACATTTACTTAAC 
GGGTCAAGTTTCCGGGTCGACCCGGATGAAGTACTGGCCGTGAATTTCATGCTCCAGCTC 
TACAAATTACTCGACGAGACGCCGACGATAGTTGACACCGCACTACGGCTCGCCAAATCG 
TTGAACCCGAGGGTCGTCACTCTCGGAGAATACGAAGTGAGCTTAAACCGGGTCGGTTTC 
GCTAACCGGGTAAAGAACGCGCTTCAATTCTATTCCGCGGTTTTCGAATCCCTTGAACCG 
AACTTGGGGCGTGATTCGGAGGAGAGAGTGAGAGTTGAGCGAGAGTTGTTCGGCCGGAGA
ATCTCGGGTTTGATTGGACCGGAGAAAACCGGAATTCATAGAGAAAGAATGGAAGAGAAA 
GAGCAATGGCGGGTATTAATGGAGAATGCCGGTTTTGAATCGGTTAAGCTGAGTAATTAC 
GCAGTGAGCCAAGCGAAGATTCTATTGTGGAATTACAATTACAGCAATTTGTATTCAATT 
GTTGAATCTAAGCCTGGCTTCATCTCTTTGGCCTGGAACGATTTACCTCTCCTCACTCTT 
TCTTCCTGGCGATAA 

>G312 Amino Acid Sequence (domain in AA coordinates: 320-336) 

MAYMCTDSGNLMAIAQQVIKQKQQQEQQQQQHHQDHQIFGINPLSLNPWPNTSLGFGLSG 

SAFPDPFQVTGGGDSNDPGFPFPNLDHHHATTTGGGFRLSDFGGGTGGGEFESDEWMETL 

ISGGDSVADGPDCDTWHDNPDYVIYGPDPFDTYPSRLSVQPSDLNRVIDTSSPLPPPTLW 

PPSSPLSIPPLTHESPTKEDPETNDSEDDDFDLEPPLLKAIYDCARISDSDPNEASKTLL 

QIRESVSELGDPTERVAFYFTEALSNRLSPNSPATSSSSSSTEDLILSYKTLNDACPYSK 

FAHLTANQAILEATEKSNKIHIVDFGIVQGIQWPALLQALATRTSGKPTQIRVSGIPAPS 

LGESPEPSLIATGNRLRDFAKVLDLNFDFIPILTPIHLLNGSSFRVDPDEVLAVNFMLQL

YKLLDETPTIVDTALRLAKSLNPRVVTL^

NLGRDSEERVRVERELFGRRISGLIGPEKTGIHRERMEEKEQWRVLMENAGFESVKLSNY 
AVSQAKILLWNYNYSNLYSIVESKPGFISLAWNDLPLLTLSSWR*
>G1444 (192.. 1001) 

AATCCCCTATCCTTCGCAAGACCCTTCCTCTATATAAGGAAGTTCATTTCATTTTT J i"l u i tJ i' 

GACACGCTGACAAGCTGACTCTAGCATATCTGGCACCGGCGACCAGTCCTTCTTTGGTGC 

AAAGATCCCAAAAAATCAAAATCGAAAGAGAGAATAAATCAAAAGGAAGAATCTTTATCT 

GCTTTCTCTCGATGAGGATCCGGAAACGACAAGTGCCTCTTCCTTTATCGTCTCTATTAC

CAGTTCCTCTATCAGATCTCTACTTTAACCGCTCACCGACGGCCACCGCGAGATACTTTC 

GCGGTGGTTATAAAGACGGCGGTGATGATTTTGGTTCTCTTCAGCTTTCGCTTCCGCCGC 

CGTCGCAGATTTCTGATCGGCTTATTCAAAGAGATTTGATAAAGAAGAAGGAGGAGGTCA 

AGGCTTTGGATGATGATAATGGTGATGTAGACGTCAAGAGTCGTACTGATGCATCGGGCA 

GCAAGAATGTTAATCCCCGAGGAGAATCCGTCTCTTCAATACAAGTTGTCGAGAAGAATG 

AAAAGGTTGTGTCTTTGAGGAAGAGAAGAGGCTTTATCAACTTTGAGGATTACGAAGATG 

AGGAAGATGAAGAAGCTAGTGGCGGTGGAGGCCGTATTAATAAAGGGAAAAAGAAAGCGA 

AAAAGAGCGGTGGTGGGTTAGAGGAAGGATCACGGTGCAGCCGTGTTAACGGTAGAGGAT 

GGAGATGTTGTCAGCAAACGCTTGTTGGTTATTCTCTTTGTGAGCATCATCTCGGTAAAG 

GAAGGGTAAGGAGCATGAACAAGAGTGGTGGTGGTCGTGGCGGCGAGAAAAAGGCGGTGG 

TGGTGGAAGTGAAGAAGAAGAGAGTAAAGCTTGGCATGGTAAAGGCACGTTCAATAAGTA 

GTTTGCTTGGACAAACCAGCACTAGTGGTGGTACTAGTGGTGATGTTGATCAGGGTGAGA 

TAAGTGCACCTGCTGATCAGTTCGCTGCATGTGATAAGTAGGTCTGTTGATCAGCATTTG 

CATGTATATGGATATGTGTATGTTTATGTACATGATGATAATGGGCATAGCGCGGCCGCT 

CTAGACAGGCCTGGAACCGGATCCTCTAGCTAGAGCTTTCGTTAGTATCATCGGGTTTAG 

ACAACGTT 

>G1444 Amino Acid Sequence (domain in AA coordinates: 168-193) 

MRIRKRQVPLPLSSLLPVPLSDLYFNRSPTATARYFRGGYKDGGDDFGSLQLSLPPPSQI 

SDRLIQRDLIKKKEEVKALDDDNGDVDVKSRTDASGSKNVNPRGESVSSIQVVEKNEKVV

SLRKRRGFINFEDYEDEEDEEASGGGGRINKGKKKAKKSGGGLEEGSRCSRVNGRGWRCC 

QQTLVGYSLCEHHLGKGRVRSMNKSGGGRGGEKKAVVVEVKKKRVKLGMVKARSISSLLG

QTSTSGGTSGDVDQGEISAPADQFAACDK* 

>G801 (27.. 746) 

GATAGTGATAACGAAATCCTAATTCCATGGCCGACAACGACGGAGCAGTGAGTAACGGCA 
TCATAGTCGAGCAGACGTC/^AACAAAGGACCTCTTAACGCCGTTAAGAAACCACCGTCTA 
AAGATCGACACAGC^tAAGTTGACGGAAGAGGAAGAAGGATTCGTATGCCAATCATTTGCG 
CAGCTCGAGTTTTTCAATTGACCAGAGAGTTAGGTCACAAGTCCGATGGTCAAACCATAG 
AGTGGCTTCTCCGTCAAGCTGAGCCTTCTATCATAGCCGCCACTGGAACTGGCACTACTC 
CGGCGAGTTTCTCCACTGCTTCTCTCTCCACT^ 

TCGTCAGAGCGGAGGAAGGAGAATCCGGCGGCGGAGGAGGAGGAGGGTTAACAGTGGGAC 
ACACAATGGGGACTTCGTTAATGGGTGGTGGTGGTTCTGGTGGGTTTTGGGCTGTTCCGG 
CGAGGCCGGATTTCGGAC7^AGTCTGGAGCTTTGCAACCGGAGCTCCACCGGAAATGGTTT 
TTGCGCAGCAGCAGCAACCAGCTACACTCTTCGTCCGCCACC^GCAGCAACAGCAAGCTT 
CCGCCGCCGCAGCAGCTGCAATGGGTGAGGCTTCAGC^GCTAGAGTTGGGAATTATCTTC 
CGGGTCATCATCTCAATTTGCTTGCTTCTTTGTCTGGTGGAGCTAACGGGTCGGGTCGGA 
GGGAAGACGACCACGAACCACGTTGAGAAATGGTATTGTOTTTTTGGTAATGTATAGAAA

AATTCCTATGTTTTATGTCATCGAAAGTGTTTAGAAAGTACCTCTAATTTGCGGTTTCTT 

TTGCTCCTTTTTTACTTAATTTAAGCTTATTGCTTGTTTGATTAGGGTTTTAGGGTTTAA 

GAATATTTGGTCTCGTTAATTTGTTTCGGAGAGTGATAGAAAGAGAGAGAGATTGATTGA 

TTGTTGTACCTAAAACGCTATAAAAGCTCTGTTTTTACTAGCGAAAAAA 

>G801 Amino Acid Sequence (domain in AA coordinates: 32-93) 

MADNDGAVSNGIIVEQTSNKGPLNAVKKPPSKDRHSKVDGRGRRIRMPIICAARVFQLTR

ELGHKSDGQTIEWLLRQAEPSIIAATGTGTTPASFSTASLSTSSPFTLGKRVVRAEEGES

GGGGGGGLTVGHTMGTSLMGGGGSGGFWAVPARPDFGQVWSFATGAPPEMVFAQQQQPAT 

LFVRHQQQQQASAAAAAAMGEASAARVGNYLPGHHLNLLASLSGGANGSGRREDDHEPR* 

>G1950 (42.. 764) 

CTGAATTCGAACTTTGGAAGAAGAAGAAGCTTTGATCAATCATGGAAATTGCAACCGATA 
CAGCAAAGCAGATGAGAGACGAAGAGTTGTTCAAAGCAGCGGAATGGGGAGATTCATCGT 
TGTTCATGTCATTATCTGAAGAACAGCTCTCTAAATCTCTCAATTTCAGAAACGAAGATG 
GTCGCTCTCTCCTCCATGTCGCTGCTTCCTTCGGCCATTCTCAAATAGTGAAGTTGTTAT 
CAAGTTCAGATGAAGCAAAGACTGTAATCAATAGCAAGGATGATGAAGGATGGGCTCCTO 
TGCAT^CCGCTGCTAGCATCGGTAATGCTGAGCTCGTTGAGGTGCTTTTGACCAGAGGTG 
CTGATGTCAATGCCAAAAATAACGGTGGTCGCACTGCTCTTC^CTATGCTGCTAGCAAAG 
GCCGGTTGGAGATTGCTCAGCTTTTATTAACACACGGTGCAAAGATTAACATCACAGACA 
AGGTTGGTTGCACTCCGCTTCACAGGGCAGCAAGCGTGGGAAAGTTAGAAGTTTGTGAAT 
TTCTTATTGAAGAAGGAGCAGAGATCGATGCTACGGATAAAATGGGTCAAACTGCACTCA 
TGCATTCAGTTATCTGCGATGACAAACAGGTTGCGTTCCTGCTTATAAGACATGGTGCAG 
ATGTGGATGTAGAAGACAAGGAAGGCTACACTGTTCTAGGCCGAGCTACCAATGAATTCC 
GACCTGCACTTATCGATGCTGCTAAGGCCATGCTTGAAGGATAAAATGACTCTGGATTAC 
TTTAAAACTTACTAACTCTGAGAGTTGTTTAGTTACTTAAAAGGATTTTTCTTTACTGTA 
TCATGTTTGCAAAATGTTTCTGCCTTATCAATTCATGTTCTGT 

>G1950 Amino Acid Sequence (domain in AA coordinates: 65-228) 
MEIATDTAKQMRDEELFKAAEWGDSSLFMSLSEEQLSKSLNFRNEDGRSLLHVAASFGHS
QIVKLLSSSDEAKTVINSKDDEGWAPLHSAASIGNAELVEVLLTRGADVNAKNNGGRTAL
HYAASKGRLEIAQLLLTHGAKINITDKVGCTPLHRAASVGKLEVCEFLIEEGAEIDATDK
MGQTALMHSVICDDKQVAFLLIRHGADVDVEDKEGYTVLGRATNEFRPALIDAAKAMLEG
* 

>G958 (55.. 1950) 

CGTCGACATGTTCATATTTGTTTCTAGCTAAGAAGTTTGTATAAGGCAGTGGACATGGCT 
CCTGTTTCAATGCCTCCAGGTTTCCGGTTTCATCCAACAGACGAAGAGCTTGTCATATAC 
TACCTCAAGCGAAAGATTAATGGTCGGACTATTGAGTTAGAGATAATACCCGAGATTGAT 
CTTTACAAATGCGAACCTTGGGATTTACCTGGGAAGTCCTTGCTGCCAAGTAAAGACCTA 
GAATGGTTCTTTTTCAGTCCTCGAGACCGGAAATATCCAAACGGATCAAGAACAAACCGG 
GCGAC CAAAGCAGGTTACTGGAAAGCC AC CGGGAAAGATCGTAAAGTGACTTCAC ATTCA 
CGGATGGTTGGAACAAAGAAAACATTAGTTTATTACCGAGGAAGAGCGCCTCATGGCTCT 
CGTACCGATTGGGTCATGCACGAGTACCGTCTTGAAGAACAAGAATGTGACTCTAAATCC 
GGTATACAGGATGCCTATGCACTTTGTCGAGTATTTAAGAAGAGTGCTTTAGCCAACAAA 
ATTGAAGAACAACACCATGGTACGAAGAAGAACAAAGGAACGACTAATAGTGAACAATCT 
ACTTCTAGTACTTGTTTGTATTCTGATGGAATGTATGAAAACCTCGAAAACTCGGGGTAT 
CCAGTCTCACCTGAGACAGGAGGCTTAACTCAACTCGGTAATAATTCGTCGTCGGATATG 
GAAACGATAGAGAATAAATGGAGTCAGTTTATGTCGCATGACACGTCCTTCAACTTCCCA 
CCTGAGTCTCi\ATATGGAACAATCTCATATCCTCCCTCGAAGGTTGATATAGCGTTAGAG 
TGTGCAAGACTACA3tAATCGTATGTTGCCACCAGTACCACCACTTTACGTAGAAGGTCTC 
ACACACAATGAATATTTTGGAAACAATGTAGCTAACGATACAGATGAAATGTTGAGCAAG 
ATTATAGCATTGGCTCAAGCCTCACATGAGCCACGAAACAGTCTAGACTCATGGGACGGT 
GGTTCTGCTTCCGGGAACTTCCATGGAGACTTTAACTATTCCGGAGAAAAAGTCTCATGC 
CTAGAGGCGAACGTGGAGGCTGTAGATATGCAAGAACACCATGTGAATTTTAAGGAAGAA 
AGACTTGTTGAAAACTTGAGATGGGTAGGAGTATCAAGCAAGGAACTTGAAAAGAGCTTC 
GTTGAAGAACACTCAACGGTAATTCCTATAGAAGATATTTGGAGATATCATAATGATAAT 
CAAGAACAAGAACATCATGATCAAGATGGTATGGACGTTAACAACAACAATGGAGATGTG 
GATGATGCTTTCACACTCGAGTTTTCGGAAAACGAACATAACGAGAATCTTTTGGACAAG 
AACGATCATGAGACAACGAGTTCCTCATGTTTTGAGGTGGTAAAAAAAGTTGAGGTTAGC 
CATGGATTGTTTGTCACAACTCGTCAGGTAACC^CACATTCTTCCAACAGATAGTACCA
TCGCAAACCGTTATAGTTTATATAAATCCGACGGATGGCAATGAGTGTTGTCATAGTATG 
ACATCAAAAGAGGAGGTTCATGTCCGTAAAAAGATAAATCCGCGAATCAACGGAGTAAGC 
TCAACAGTTCTTGGACAATGGAGAAAATTCGCGCATGTTATTGGCTTCATTCCTATGCTT 
CTATTGATGCGTTGTGTTCATCGAGGTAACTCTAACAAAAACAGAGGCAGTGAAGGTTAC 
TCGAGGCAGCCTACGAGAGGAGATTGTAACAATCGGGGAACAATACTCATGATGGAAAAT 
GCTGTCGTGAGAAGAAAAATTTGGAAGAAGAAGAAAGAGAAAAATATGGTTGACGAACAA 
GGTTTTCGGTTTCAAGATAGTTTCGTATTGAAGAAGTTGGGGCTTTCTCTTGCTATCATC 
TTAGCTGTTTCTACCATAAGTCTTATTTGAATACTGAGGTTCAATATATCATATATGGCT 
TTTCACTTTTCTATTGTACTCCCATTTGCCTAGGTCGTATGC 

>G958 Amino Acid Sequence (conserved domain in AA coordinates: 7-156)
MAPVSMPPGFRFHPTDEELVIYYLKRKINGRTIELEIIPEIDLYKCEPWDLPGKSLLPSK 
DLEWFFFSPRDRKYPNGSRTNRATKAGYWKATGKDRKVTSH^ 
GSRTDWVMHEYRLEEQECDSKSGIQDAYALCRVFKK^

QSTSSTCLYSDGMYENLENSGYPVSPETGGLTQLGNNSSSDMETIENKWSQFMSHDTSFN 

FPPQSQYGTISYPPSKTOIALECARLQ3STRMLPPVPPLYVEGLTHNEYFGNNV 

SKIIALAQASHEPRNSLDSWDGGSASGNFHGDFNYSGEKVSCLEANVEAVDMQEHHW

EERIVVENLRWGVSSKELEKSFVEEHSTVIPIEDIWRYHOT^ 

DVDDAFTLEFSENEHNENLLDKITOHBTTSSSC^ 

VPSQWIWINPTDGNECCHSMTSKEEVHVRKKI^ 

MLLLMRCVHRGNSNKNRGSEGYSRQPTRGDCN^^ 

EQGFRFQDSFVLKKLGLSLAIILAVSTISLI* 

>G1037 (1..1722) 

ATGACTGTTGAACAAAATTTAGAAGCTTTGGATCAGTTTCCTGTAGGAATGAGAGTTCTT 

GCTGTTGATGATGACCAAACTTGTCTCAAAATCCTTGAATCTCTCCTTCGTCACTGCCAA 

TACCATGTAACAACGACGAACCAAGCACAAAAGGCTTTAGAGTTATTGAGAGAGAACAAG 

AACAAGTTTGATCTGGTTATTAGTGATGTTGACATGCCTGACATGGATGGTTTCAAACTC

CTTGAGCTTGTTGGTCTTGAAATGGACCTACCTGTCATAATGTTGTCTGCGCATAGTGAT 

CCAAAGTATGTGATGAAGGGAGTTACTCATGGTGCTTGTGATTATCTACTGAAGCCGGTT 

CGTATTGAGGAGTTGAAGAACATATGGCAACATGTCGTGAGAAGTAGATTTGATAAGAAC 

CGTGGGAGTAATAATAATGGTGATAAGAGAGATGGATCAGGTAATGAAGGTGTTGGGAAT 

TCTGATCCGAACAATGGGAAAGGTAATAGAAAACGTAAAGATCAGTATAATGAAGATGAG 

GATGAGGATAGAGATGATAATGATGATTCGTGTGCTCAAAAGAAGCAACGTGTTGTTTGG 

ACTGTTGAGCTGCATAAGAAATTTGTTGCAGCTGTTAACCAATTGGGATATGAGAAGGCT 

ATGCCTAAAAAGATTTTGGATCTGATGAATGTTGAGAAGCTCACTAGAGAAAATGTGGCC 

AGTCATCTrCAGAAATTCCGCCTTTACTTGAAGAGGATCAGTGGTGTGGCTAATCAGCAA 

GCTATTATGGCAAACTCTGAGTTACATTTTATGCAAATGAATGGACTTGATGGTTTCCAT 

CACCGCCCAATCCCTGTTGGATCTGGTCAGTACCATGGTGGGGCTCCTGCAATGAGATCT 

TTCCCTCCAAACGGGATTCTTGGCAGACTCAATAGCTCTTCGGGGATCGGTGTCCGCAGC 

CTTTCTTCTCCTCCTGCAGGAATGTTCTTGCAAAACCAGACCGATATCGGAAAGTTTCAC 

CATGTCTCATCACTTCCTCTTAACCACAGTGATGGAGGAAACATACTTCAAGGGTTGCCA 

ATGCCTTTAGAGTTCGACCAGCTTCAGACAAACAACAACAAAAGTAGAAACATGAACAGT 

AACAAGAGCATTGCTGGGACCTCCATGGCTTTTCCTAGCTTCTCTACGCAACAAAACTCG 

CTCATCAGTGCTCCTAATAACAATGTCGTGGTTCTAGAAGGTCACCCACAAGCAACTCCT 

CCAGGCTTCCCAGGACACCAGATCAATAAACGTTTGGAGCATTGGTCAAATGCTGTATCC 

TCTTCGACTCACCCTCCTCCCCCGGCACATAACAGTAATAGTATCAATCATCAGTTCGAT 

GTCTCTCCATTACCGCATTCTAGACCCGACCCCTTGGAATGGAACAATGTGTCATCAAGC 

TACTCTATACCATTCTGTGACTCTGCCAATACATTGAGTTCTCCAGCCTTGGATACAACA 

AATCCCCGAGCTTTCTGTAGAAACACGGACTTCGATTCAAACACAAATGTGCAACCTGGA 

GTCTTTTATGGTCCATCCACGGATGCTATGGCTCTGTTGAGTAGTAGTAACCCGAAAGAA 

GGGTTCGTCGTAGGCCAACAGAAGTTACAGAGTGGTGGATTCATGGTTGCAGATGCTGGT 

TCCTTAGATGATATAGTCAACTCCACGATGAAGCAGGTGTGA 

>G1037 Amino Acid Sequence (domain in AA coordinates: 11-134, 200-248) 

MTVEQNLEALDQFPVGMRVLAVDDDQTCLKILESLLRHCQYHVTTTNQAQKALELLRENK

NKFDLVISDVDMPDMDGFKLLELVGLEMDLPVIMLSAHSDPKYVMKGVTHGACDYLLKPV

RIEELKNIWQHVVRSRFDKNRGSNNNGDKRDGSGNEGVGNSDPNNGKGNRKRKDQYNEDE

DEDRDDNDDSCAQKKQRVVWTVELHKKFVAAVNQLGYEKAMPKKILDLMNVEKLTRENVA

SHLQKFRLYLKRISGVANQQAIMANSELHFMQMNGLDGFHHRPIPVGSGQYHGGAPAMRS

FPPNGILGRLNSSSGIGVRSLSSPPAGMFLQNQTDIGKFHHVSSLPLNHSDGGNILQGLP

MPLEFDQLQTNNNKSRNMNSNKSIAGTSMAFPSFSTQQNSLISAPNNNVVVLEGHPQATP

PGFPGHQINKRLEHWSNAVSSSTHPPPPAHNSNSINHQFDVSPLPHSRPDPLEWNNVSSS

YSIPFCDSANTLSSPALDTTNPRAFCRNTDFDSNTNVQPGVFYGPSTDAMALLSSSNPKE

GFVVGQQKLQSGGFMVADAGSLDDIVNSTMKQV*

>G2065 (33..1124)

AACCACACAAAACAAAACAAAAAAACATATTGATGGGGATGAAGAAGGTAAAGCTATCTT 
TGATAGCTAATGAAAGATCAAGGAAAACATCCTTCATGAAGAGGAAAAACGGGATATTCA 
AGAAACTCCACGAGTTGTCAACTCTATGTGGTGTCCAAGCTTGTGCTCTCATCTATAGTC 
CATTCATACCGGTTCCAGAGTCATGGCCGTCAAGGGAAGGTGCTAAAAAGGTAGCTTCAA 
AGTTTCTGGAGATGCCGCGGACAGCCCGAACCAGGAAGATGATGGATCAAGAAACCCATC 
TTATGGAGAGGATTACCAAAGCAAAAGAGCAACTAAAGAATTTGGCTGCTGAGAACCGAG 
AATTACAGGTTAGACGATTTATGTTTGATTGTGTTGAAGGCAAAATGTCCCAGTATCGTT 
ATGATGCAAAAGACCTTCAAGATTTGCTATCTTGTATGAATCTATATCTCGATCAGCTTA 
ACGGAAGGATCGAGTCCATTAAAGAAAACGGTGAGTCGTTGTTGTCTTCCGTCTCTCCTT 
TTCCTACTAGAATTGGTGTTGACGAAATTGGTGATGAGTCGTTTTCCGACTCTCCTATTC 
ATTCTAC^CTAGGGTTGTAGATACTCCTAATGCTACCAATCCTCATGTTCTTGCGGGCG 

ATCATATTCAATATGAAAATATGAATATGAGTCAAAATCTGCATGAACCGTTTCAACACC 
TTGTTCCTACTAACGTTTGTGATTTTTATCAAAATCAGAATATGAATCAGGTTCAATACC 
AGGCTCCTAATAATCTGTTTAATCAGATTCAACGAGAATTCTACAACATAAATTTGAATC 
TGAATTTGAATCTGAATTCAAATCAGTATCTGAATCAACAACAATCATTCATGAATCCGA 
TGGTGGAACAACATATGAATCATGTTGGAGGGCGTGAAAGCATTCCTTTCGTGGACAGAA 
ACTACTACAACTACAATCAACTACCAGCCGTTGATCTTGCTTCCACCAGTTACATGCCTT 
CAACCACCGATGTTTATGATCCTTACATCAACAACAATCTCTAATCACAAAAGACGGAGA 
TTTTCTAGTTTAA 

>G2065 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGMKKVKLSLIANERSRKTSFMKRKNGIFKKLHELSTLCGVQACALIYSPFIPVPESWPS

REGAKKVASKFLEMPRTARTRKMMDQETHLMERITKAKEQLKNLAAENRELQVRRFMFDC

VEGKMSQYRYDAKDLQDLLSCMNLYLDQLNGRIESIKENGESLLSSVSPFPTRIGVDEIG 

DESFSDSPIHSTTRVVDTPNATNPHVLAGDMTPFI£)ADA^ 

QNLHEPFQHLVPTNVCDFYQNQNMNQVQYQAPNNLFNQIQREFYNINLNLNLNLNSNQYL

NQQQSFMNPMVEQHMNHVGGRESIPFVDRNYYNYNQLPAVDLASTSYMPSTTDVYDPYIN

NNL*

>G2137 (77..1123)

GGGATTTGACTTTAGCACTTCAAAATCCAAAGCTAAAAGACAAAAAAGAATAGAGGTTCG 
ATTTGCATCTCCATTAATGGGCATCGATCTTTCTCTTAAGCTCGAGGCCGAGGAGAAAAA 
GAAAGAGATAGAAGGATCGAAACATAGCCGTGAGAACAAAGAAGACGAAGAACATGATGC 
TAGTGGTGATGAAGATGAACAAATGGTGAAAGAAGACGAAGATGATTCTTCTTCTTTAGG 
TTTAAGAACCCGAGAAGAAGAAAACGAACGTGAAGAGCTCTTGCAGCTACAGATCCAGAT 
GGAAAGTGTGAAAGAAGAGAATACTAGGTTGAGGAAGCTTGTCGAGCAGACTCTTGAAGA 
TTATCGTCATCTTGAGATGAAATTCCCGGTTATCGATAAAACCAAGAAGATGGATCTTGA 
AATGTTCCTTGGAGTACAAGGCAAACGATGTGTGGATATAACAAGTAAGGCTCGGAAAAG 
AGGAGCTGAGAGATCTCCGTCAATGGAAAGAGAAATAGGGCTTTCACTTTCTCTAGAGAA 
AAAACAGAAACAAGAAGAGAGCAAAGAAGCTGTTCAGTCTCATCACCAAAGATACAATAG 
TAGC^GCTTAGATATGAATATGCC^CGTAT(^TTTCATCTTCTCiUVGGTAATAGAAAGGC 
CAQGGTGTCCGTGAQGGCGAGATGTGAGACCGCAACAATGAATGATGGATGCCAATGGAG 
GAAGTACGGTCAGAAAACCGCGAAAGGGAATCCATGTCCTCGAGCTTATTACCGATGCAC 
CGTGGCTCCAGGATGTCCCGTTAGAAAACAGGTGCAAAGGTGTTTAGAAGACATGTCAAT 
ACTGATAACAACCTACGAAGGAACACATAACCATCCACTTCCGGTCGGAGCAACAGCCAT 
GGCTTCCACTGCCTCTACTTCTCCATTCTTGTTACTCGATTCCAGTGACAACCTCTCTCA 
TCCTTCCTATTACCAAACTCCTCAAGCCATAGACTCTTCTTTGATTACATACCCACAAAA 
TAGCAGCTACAACAATCGAACCATAAGAAGCTTGAACTTTGATGGTCCATCTAGAGGAGA 
TC^CGTTTCATCTTCTCAAAACCGATTAAATTGGATGATGTAGAGTTTCCTATATCTCTA 
TGCTTGTTCTTTGGTCCCATTATTTGTCATTATGGATTCTTTGCCTTTCTTCTTGTTCTC 
GTTTCTAACATTTATGTTTCGTATA 
>G2137 Amino Acid Sequence (conserved domain in AA coordinates: 109-168)

MGIDLSLKLEAEEKKKEIEGSKHSRENKEDEEHDASGDEDEQMVKEDEDDSSSLGLRTRE 

EENEREELLQLQIQMESVKEENTRLRKLVEQTLEDYRHLEMKFPVIDKTKKMDLEMFLGV

QGKRCVDITSKARKRGAERSPSMEREIGLSLSLEKKQKQEESKEAVQSHHQRYNSSSLDM

NMPRIISSSQGNRKARVSVRARCETATMNDGCQWRKYGQKTAKGNPCPRAYYRCTVAPGC

PVRKQVQRCLEDMSILITTYEGTHNHPLPVGATAMASTASTSPFLLLDSSDNLSHPSYYQ

TPQAIDSSLITYPQNSSYNNRTIRSLNFDGPSRGDHVSSSQNRLNWMM*

>G746 (1..1311)

ATGGGTGAGGAGTTAGCTGACACAATGAACCTGGATTTGAATCTTGGGCCTGGTCCTGAG 

TCTGATCTCCAACCTGCACCAAACGAGACTGTGAATTTGGCTGATTGGACTAATGACCCG 

CCTGAGAGATCTTCTGAAGCTGTGACAAGGATCAGGACTCGGCATAGGACACGGTTCAGA 

CAGCTTAATCTCCCGATCCCGGTTCTATCTGAAACCCATACCATGGCTATAGAGCTCAAC 

CAGTTGATGGGAAATTCTGTAAATAGAGCTGCTATGCAGACTGGTGAGGGTAGTGAAAGA 

GGCAATGAGGATTTGAAAATGTGTGAGAATGGCGATGGAGCCCTTGGGGACGGTGTATTG 

GATAAGAAAGCGGATGTCGAGAAAAGCAGTGGCAGCGACGGTAACTTTTTCGATTGTAAT 

ATATGTTTGGATTTGTCGAAGGAGCCGGTTCTCACCTGTTGTGGTCATCTTTACTGTTGG 

CCTTGTCTGTACCAATGGTTACAAATTTCGGATGCAAAGGAATGTCCTGTTTGTAAAGGA 

GAGGTGACCTCCAAAACCGTGACACCGATCTATGGACGTGGAAACCACAAGAGAGAAATT 

GAAGAGAGTTTAGATACTAAGGTCCCCATGAGACCACACGCGAGACGCATTGAGAGCTTG 

AGGAATAGAATTCAAAGGTCGCCTTTTACAATACCAATGGAAGAAATGATTAGACGTATA 

CAGAATAGGTTTGACAGGGATTCAACCCCAGTCCCTGATTTTAGTAACCGAGAGGCATCA 

GAAAGAGTCAACGATCGAGCCAATTCGATCCTTAACCGGTTGATGACATCTAGGGGAGTT 

AGATCAGAGCAGAACCAGGCTAGTGCTGCAGCAGCAGCCATTGTCGCAGCATCAGAGGAT 

ATTGATCTAAATCCAAACATTGCTCCTGATCTTGAAGGAGAAAGCAACACGAGATTCCAT

CCTCTCTTGATCAGGAGACAGTTACAGTCGCACCGAGTTGCAAGGATCTCGACTTTCACT 

TCTGCGTTGAGTTCAGCTGAGAGGCTTGTGGATGCGTATTTTAGGACTCATCCGTTGGGG 

AGGAACCACCAAGAGCAAAACCATCATGCTCCTGTTGTGGTTGATGATAGAGACTCATTC 

TCAAGCATTGCAGCTGTTATAAACTCTGAGAGTCAAGTGGATACTGCAGTTGAGATCGAT 

TCTATGGCTCTTTCGACATCGTCCTCGAGGAGAAGGAATGAGAATGGTTCGAGGGTTTCT 

GATGTAGACAGTGCAGATTCTCGTCCGCCTAGGAGAAGGAGATTTACTTGA 

>G746 Amino Acid Sequence (domain in AA coordinates: 139-178)

MGEELADTMNLDLNLGPGPESDLQPAPNETVNLADWTNDPPERSSEAVTRIRTRHRTRFR

QLNLPIPVLSETHTMAIELNQLMGNSVNRAAMQTGEGSERGNEDLKMCENGDGALGDGVL

DKKADVEKSSGSDGNFFDCNICLDLSKEPVLTCCGHLYCWPCLYQWLQISDAKECPVCKG

EVTSKTVTPIYGRGNHKREIEESLDTKVPMRPHARRIESLRNTIQRSPFTIPMEEMIRRI

QNRFDRDSTPVPDFSNREASERVNDRANSILNRLMTSRGVRSEQNQASAAAAAIVAASED

IDLNPNIAPDLEGESNTRFHPLLIRRQLQSHRVARISTFTSALSSAERLVDAYFRTHPLG

RNHQEQNHHAPVVVDDRDSFSSIAAVINSESQVDTAVEIDSMALSTSSSRRRNENGSRVS

DVDSADSRPPRRRRFT*

>G2701 (46..837)

GTGTTTGTAGTTGAAACTTATTCTTCCCTTTTTTTGTTTTTAGGTATGGAGACTCTGCAT 
CCATTCTCTCACCTACCTATCTCTGACCACCGGTTCGTTGTTCAAGAGATGGTGAGCTTA
CACAGCTCGAGTAGCGGTAGCTGGACTAAAGAAGAGAACAAGATGTTCGAACGAGCTCTT 
GCGATATACGCTGAAGACTCGCCTGATCGCTGGTTTAAAGTTGCTTCCATGATCCCTGGA 
AAGACTGTTTTTGATGTTATGAAGCAATATAGTAAGCTTGAAGAAGACGTTTTCGATATT 
GAAGCAGGACGTGTTCCCATTCCTGGTTATCCTGCAGCTTCTTCTCCCTTGGGGTTTGAC 
ACGGACATGTGTCGTAAACGGCCTAGTGGAGCTAGAGGATCTGATCAAGATCGAAAGAAA 

GGAGTCCCTTGGACAGAGGAAGAACACAGGAGATTCTTGTTAGGCCTTCTCAAGTACGGT
AAAGGAGATTGGAGAAACATATCGAGAAACTTCGTGGTGTCAAAGACGCCAACGCAAGTG
GCGAGCCACGCCCAAAAGTATTACCAGAGACAGCTCTCCGGAGCCAAGGACAAACGCAGG
CCAAGTATCCATGACATCACAACCGGCAATCTTCTCAATGCCAATCTCAACCGTTCCTTT
TCCGATCATAGAGATATTCTCCCTGATTTAGGGTTTATCGATAAGGATGATACGGAGGAG
GGAGTAATATTTATGGGTCAGAATCTCTCTTCAGAAAATCTGTTTTCTCCATCACCAACT

TCATTCGAAGCTGCCATTAACTTCGCCGGAGAAAATGTCTTCAGTGCCGGAGCTTAAGGC 
AACATAGAATCCCCAAACTCAGCGGC 

>G2701 Amino Acid Sequence (domain in AA coordinates: 33-81, 129-183) 
METLHPFSHLPISDHRFVVQEMVSLHSSSSGSWTKEENKMFERALAIYAEDSPDRWFKVA
SMIPGKTVFDVMKQYSKLEEDVFDIEAGRVPIPGYPAASSPLGFDTDMCRKRPSGARGSD
QDRKKGVPWTEEEHRRFLLGLLKYGKGDWRNISRNFVVSKTPTQVASHAQKYYQRQLSGA
KDKRRPSIHDITTGNLLNANLNRSFSDHRDILPDLGFIDKDDTEEGVIFMGQNLSSENLF
SPSPTSFEAAINFAGENVFSAGA*
>G1819 (1..639) 

ATGGAAGAGAACAACGGCAACAACAACCACTACCTGCCGCAACCATCGTCTTCCCAACTG 
CCGCCGCCACCATTGTATTATCAATCAATGCCGTTGCCGTCATATTCACTGCCGCTGCCG 
TACTCACCGCAGATGCGGAATTATTGGATTGCGCAGATGGGAAACGCAACTGATGTTAAG 
CATCATGCGTTTCCACTAACCAGGATAAAGAAAATCATGAAGTCCAACCCGGAAGTGAAC 
ATGGTCACTGCAGAGGCTCCGGTCCTTATATCGAAGGCCTGTGAGATGCTCATTCTTGAT 
CTCACAATGCGATCGTGGCTTCATACCGTGGAGGGCGGTCGCCAAACTCTCAAGAGATCC 
GATACGCTCACGAGATCCGATATCTCCGCCGCAACGACTCGTAGTTTCAAATTTACCTTC 
CTTGGCGACGTTGTCCCAAGAGACCCTTCCGTCGTTACCGATGATCCCGTGCTACATCCG 
GACGGTGAAGTACTTCCTCCGGGAACGGTGATAGGATATCCGGTGTTTGATTGTAATGGT 
GTGTACGCGTCACCGCCACAGATGCAGGAGTGGCCGGCGGTGCCTGGTGACGGAGAGGAG 
GCAGCTGGGGAAATTGGAGGAAGCAGCGGCGGTAATTGA 

>G1819 Amino Acid Sequence (domain in AA coordinates: 46-188) 
MEENNGNNNHYLPQPSSSQLPPPPLYYQSMPLPSYSLPLPYSPQMRNYWIAQMGNATDVK 
HHAFPLTRIKKIMKSNPEVNMVTAEAPVLISKACEMLILDLTMRSWLHTVEGGRQTLKRS

DTLTRSDISAATTRSFKFTFLGDVVPRDPSVVTDDPVLHPDGEVLPPGTVIGYPVFDCNG

VYASPPQMQEWPAVPGDGEEAAGEIGGSSGGN* 

>G1227 (372..1451)

TCTTCCGTGTGTTAACAGAAGTCCCCACAATTGTCTGTCTTCGCTGCGAGACAAAACTGC 

CACAGCCAATAATGTTTCTCTGAGGGACCTTGCTTCTGTCAGAGACTCGCTCTCTCTCTC 

CTCTTCTTGCTCTGCTCAGCTCTCTCACCAACTCATCTTCAGTCCTCAAAC^VAACATCTG 

TTCTCATCTTTGTTTTCTTTCCTT^ 

TCTTCAACATCTTCATAGCAATTTAAGACC^ 

AAACTCCTCACATTTATTTCTTCCCCATCATTGTTTTAGAGAGGGAGAAAGAAAAAGAGC 
TC^GCTTTCTGATGGAGAGGAGTATTCAAGGACAAAACAAGCTCTGTTGTTTGGACCA^ 
AAGTGAATGTGAGAAGAAGCCTACAAGTTCAAGAAACTGTAGAGGATCATCAAAGCTTTG 
CCCTTGAAGAGGAAGAACAACAACTCTCAACTCCGAGCTTGCTGCAAGACACAACAATAC 
CATTTCTACAAATGCTGCAACAAAGTGAAGACCCTTCACCGTTTTTGTCATTCAAAGACC 
CAAGCTTTCTAGCACTACTATCTCTCC^GACACTTGAAAAGCCTTGGGAACTCGAAAACT 
ACCTCCCACATGAAGTTCCAGAGTTTCATTCACCGATCCATTCTGAAACCAACCACTACT 
ATCATAATCCATCTTTGGAAGGAGTCAATGAAGCCATCTCAAACCAAGAACTTCCATTCA 
ACCCACTAGAGAATGCGCGTTCAAGACGCAAGCGGAAAAACAACAACTTGGCATCATTGA 
TGACAAGAGAAAAGCGAAAGAGAAGAAGAACTAAACCAACAAAGAACATAGAAGAGATAG 
AGAGTCAAAGAATGACACACATTGCGGTTGAACGAAACCGCAGACGCCAAATGAACGTTC 
ATCTGAACTCACTCCGCTCCATCATTCCATCTTCATACATCCAGAGGGGAGACCAAGCGT 
CAATAGTAGGAGGAGCAATAGACTTCGTAAAGATCCTAGAGCAACAGTTGCAATCCCTTG 
AAGCACAAAAGAGAAGTCAACAGAGTGATGATAACAAAGAGCAAATTCCAGAAGATAACA 
GTCTCAGGAACATTTCGTCGAACAAGTTGCGTGCGAGTAATAAAGAAGAACAAAGTAGCA 
AACTCAAAATCGAAGCCACAGTGATAGAGAGTCACGTCAACCTAAAAATTCAATGTACGA 
GGAAACAAGGACAACTTCTCAGATCAATCATATTGCTGGAGAAACTTCGATTCACTGTTC 
TTCATCTCAACATCACATCTCCGACCAATACATCTGTCTCTTATTCCTTCAACCTCAAGA
TGGAAGATGAATGTAATTTGGGATCAGCGGATGAGATAACGGCGGCGATTCGTCAGATTT 
TCGACAGCTGATTGACTAATCCAAGTAAAAAGTAAAATAAAAAAAGAAACGTTTACTTTG 
GTAACTTCGTTTTCATGATTAAATTCTTTATTTGGTCGTATGTGATTGGAGTCTTCTCGG 
CATGGAACTTGACTTTGGTTTTAGGGTACTAGTCTCTACAGAAGCTGTGGTCCTTCTTTG 
GATGC 

>G1227 Amino Acid Sequence (domain in AA coordinates: 183-244) 
MERSIQGQNKLCCLDQKVNVRRSLQVQETVEDHQSFALEEEEQQLSTPSLLQDTTIPFLQ 
MLQQSEDPSPFLSFKDPSFLALLSLQTLEKPWELENYLPHEVPEFHSPIHSETNHYYHNP 
SLEGVNEAISNQELPFNPLENARSRRKRKNNNLASLMTREKRKRRRTKPTKNIEEIESQR

MTHIAVERNRRRQMNVHLNSLRSIIPSSYIQRGDQASIVGGAIDFVKILEQQLQSLEAQK
RSQQSDDNKEQIPEDNSLRNISSNKLRASNKEEQSSKLKIEATVIESHVNLKIQCTRKQG 
QLLRSIILLEKLRFTVLHLNITSPTNTSVSYSFNLKMEDECNLGSADEITAAIRQIFDS* 
>G2417 (118..1311)

CATACCGGTGGAAGATTCTGCTTTACTACGCTCTCCGCTTCTTCTTCTCCTCGATTCGAT 
TCTCCTCATGGGTTTATCATGAATTTTTAGGTTTTGAGTAATTCAGAAACTCGAGTGATG 
ATCCCGAATGATGATGATGATGCAAATTCTATGAAGAATTATCCGTTAAATGATGATGAT 
GCAAATTCTATGAAGAATTATCCGTTAAATGATGATGATGCAAATTCTATGGAGAATTAT 
CCGTTAAGGTCAATTCCGACGGAGCTTTCACACACTTGTTCATTGATACCACCTTCTTTA 
CCAAACCCTTCAGAAGCAGCAGCAGACATGTCCTTGAATTCAGAACTCAATCAAATCATG 
GCAAGGCCTTGTGATATGCTCCCTGCCT^TGGTGGAGCTGTTGGTGATAACCCTTTTTTG 
GAACCAGGATTCAACTGCCCCGAGACAACAGATTGGATTCCCTCTCCACTCCCCCATATT 
TATTTTCCTTCGGGTTCTCCCAATCTAATAATGGAGGATGGTGTCATTGATGAGATTCAC
AAACAAAGTGACTTGCCACTTTGGTATGACGACTTGATTACCACTGATGAAGATCCACTC 
ATGTCTAGTATCTTGGGCGATCTTCTCCTTGACACTAATTTCAACTCAGCTTCAAAGGTG 
CAGCAACCAAGTATGCAATCGCAGATTCAACAACCCCAAGCTGTTCTGCAGCAGCCTTCT 
TCTTGTGTGGAATTGCGCCCACTTGATAGGACAGTATCCTCAAACAGCAACAACAATAGC 
AACAGTAATAATGCAGCAGCAGCAGCTAAGGGACGTATGCGTTGGACGCCTGAACTTCAT 
GAGGTTTTTGTTGACGCTGTTAACCAGCTCGGTGGCAGTAATGAAGCAACTCCTAAAGGT 
GTCCTGAAGCATATGAAAGTCGAAGGTTTGACTATTTTTCATGTCAAAAGTCATTTGCAG 
AAATATAGAACAGCTAAATATATACCAGTACCATCAGAAGGTTCGCCGGAGGCAAGGTTG 
ACACCGCTTGAGCAAATTACATCTGATGATACGAAACGTGGGATAGATATCACTGAGACT 
CTGCGAATTCAGATGGAACATCAGAAGAAACTGCATGAGCAGCTTGAGAGTCTAAGAACA 
ATGCAACTTCGGATAGAAGAGCAAGGAAAGGCGCTGTTGATGATGATTGAGAAGCAAAAT 
ATGGGTTTCGGCGGACCAGAACAAGGAGAGAAAACAAGTGCGAAAACGCCTGAAAATGGT 
TCAGAGGAGTCGGAATCCCCGCGGCCAAAGCGTCCGAGAAATGAAGAATGAAGGAAACCT 
TTCTTCGGATGGTAGATCATAAAACTGTGGTTTTGGTGGAGTTGTAGAGTATGACTTATT 
AGGAGTAGAGCTTTCAGTCTTCTTCAGGC 

>G2417 Amino Acid Sequence (domain in AA coordinates: 235-285) 

MIPNDDDDANSMKNYPLNDDDANSMKNYPLNDDDANSMENYPLRSIPTELSHTCSLIPPS

LPNPSEAAADMSFNSELNQIMARPCDMLPANGGAVGHNPFLEPGFNCPETTDWIPSPLPH

IYFPSGSPNLIMEDGVTDEIHKQSDLPLWYDDLITTDEDPLMSSILGDLLLDTNFNSASK

VQQPSMQSQIQQPQAVLQQPSSCVELRPLDRTVSSNSNNNSNSNNAAAAAKGRMRWTPEL

HEVFVDAVNQLGGSNEATPKGVLKHMKVEGLTIFHVKSHLQKYRTAKYIPVPSEGSPEAR

LTPLEQITSDDTKRGIDITETLRIQMEHQKKLHEQLESLRTMQLRIEEQGKALLMMIEKQ 

NMGFGGPEQGEKTSAKTPENGSEESESPRPKRPRNEE* 

>G2116 (104..1117)

TTCATCTCCATCATTATCTCCATTGACATTGTTCTCAATTGCGAATAATAATCATAATTA 
TTCACACAACCAAAGCATTCATCTCTCAGATTCTCTTAAAAAAATGGAGAAATCAGATCC 
TCCACCAGTCCCAAAGCCCGGCGCCACTATTATCCCCTCCTCCGATCCAATTCCTAATGC 
CGATCCGATTCCATCTTCTTCCTTCCACCGCCGATCTCGCTCCGACGATATGTCCATGTT 
CATGTTCATGGATCCCCTCTCCTCCGCCGCACCACCTTCCTCCGACGACCTTCCCTCCGA 
CGACGATCTCTTCTCTTCTTTCATCGATGTCGATAGCCTCACCTCTAATCCCAATCCCTT 
TCAAAATCGTTCCCTCTCCTCCAACTCCGTTTCCGGCGCTGCTAATCCTCCTCCTCCTCC 
TTCCTCTCGTCCTCGCCACCGTCACAGCAATTCCGTTGACGCTGGATGCGCCATGTATGC 
CGGTGATATCATGGACGCTAAGAAAGCTATGCCTCCTGAAAAACTCTCTGAGCTTTGGAA 
CATCGATCCCAAACGCGCCAAAAGGATTCTAGCGAATCGACAATCTGCAGCTCGATCCAA 
AGAGAGAAAAGCTCGATACATTCAAGAACTTGAGCGCAAAGTTCAATCTCTTCAAACCGA 
AGCTACCACTCTCTCTGCTCAGCTTACTCTCTACCAGAGAGACACAAATGGACTAGCAAA 
CGAAAACACAGAGCTGAAACTTAGGTTGCAAGCAATGGAACAACAAGCTCAGCTTCGTAA 
TGCTTTAAACGAAGCGTTGAGGAAAGAAGTTGAAAGGATGAAGATGGAGACAGGAGAAAT
CTCTGGTAATTCAGATTCGTTTGATATGGGAATGCAGCAGATTCAGTATTCTTCCTCAAC 
TTTCATGGCTATTCCACCATATCATGGCTCAATGAACCTCCATGATATGCAGATGCATTC 
TAGTTTCAATCCTATGGAGATGTCCAATTCTCAAAGCGTGTCGGACTTTCTACAG^CGG 
CCGAATGC^GGGCTGGAGATTAGTAGCAATAGCTCAAGCTTAGTCAAATCTGAAGGACC 
TTCTCTCTCTGCTAGTGAGAGTAGCTCTGCCTATTGACGACAAGATTATGATGAGGCTCA 
TTTTTCTG 

>G2116 Amino Acid Sequence (conserved domain in AA coordinates: 150-210)

MEKSDPPPVPKPGATIIPSSDPIPNADPIPSSSFHRRSRSDDMSMFMFMDPLSSAAPPSS 

DDLPSDDDLFSSFIDVDSLTSNPNPFQNPSLSSNSVSGAANPPPPPSSRPRHRHSNSVDA 
GCAMYAGDIMDAKKAMPPEKLSELWNIDPKRAKRILANRQSAARSKERKARYIQELERKV

QSLQTEATTLSAQLTLYQRDTNGLANENTELKLRLQAMEQQAQLRNALNEALRKEVERMK

METGEISGNSDSFDMGMQQIQYSSSTFMAIPPYHGSMNLHDMQMHSSFNPMEMSNSQSVS 

DFLQNGRMQGLEISSNSSSLVKSEGPSLSASESSSAY* 

>G647 (1..948)

ATGATGATCGGCGAAAATAAAAACCGGCCACATCCAACGATCCATATCCCTCAATGGGAT 

CAAATCAACGATCCAACGGCCACAATCTCTTCACCATTCTCTTCCGTCAACCTTAACAGC 

GTTAACGACTACCCACACTCTCCGTCACCGTATCTCGACTCCTTCGCTTCTCTCTTCCGT 

TACCTCCCGTCAAACGAGTTAACAAACGATTCAGACTCATCAAGTGGCGACGAGTCATCA 

CCACTCACCGACTCATTCTCCTCCGACGAGTTTCGCATCTACGAGTTCAAAATCCGGCGA 

TGCGCTCGAGGTCGATCTCATGATTGGACGGAGTGTCCGTTCGCACATCCCGGAGAAAAA 

GCTCGACGACGTGATCCGAGAAAGTTTCATTACTCCGGCACCGCTTGTCCTGAGTTTCGT 

AAAGGAAGTTGTAGAAGAGGTGATTCGTGTGAGTTCTCTCATGGAGTTTTCGAGTGTTGG 

CTCCATCCTTCTCGTTACCGTACTCAGCCGTGTAAAGACGGAACTAGCTGCCGGAGAAGA 

ATCTGTTTCTTCGCTCATACGACGGAGCAGTTACGTGTATTACCTTGTTCGTTAGATCCA 

GATCTTGGATTCTTCTCAGGATTAGCTACTTCTCCGACTTCX3ATTCTTGTTTCTCCTTCG 

TTTTC^CCACCGTCGGAATCTCCGCCGCTTTCTCCGAGTACCGGTGAACTTATTGCGTCG 

ATGAGGAAAATGCAATTGAACGGAGGTGGTTGTTCGTGGAGTTCTCCGATGAGATCTGCA 

GTTAGGTTACCTTTTTCGTCGTCTCTGCGTCCGATTCAGGCGGCAACGTGGCCGAGGATA 

AGAGAGTTTGAGATCGAAGAAGCTCCGGCGATGGAATTTGTGGAATCTGGGAAAGAGCTG 

AGAGCGGAGATGTATGCAAGACTCAGTAGAGAGAACTCACTCGGTTGA 

>G647 Amino Acid Sequence (domain in aa coordinates: 77-192) 

MMIGENKNRPHPTIHIPQWDQINDPTATISSPFSSVNLNSVNDYPHSPSPYLDSFASLFR

YLPSNELTNDSDSSSGDESSPLTDSFSSDEFRIYEFKIRRCARGRSHDWTECPFAHPGEK 

ARRRDPRKFHYSGTACPEFRKGSCRRGDSCEFSHGVFECWLHPSRYRTQPCKDGTSCRRR 

ICFFAHTTEQLRVLPCSLDPDLGFFSGLATSPTSILVSPSFSPPSESPPLSPSTGELIAS

MRKMQLNGGGCSWSSPMRSAVRLPFSSSLRPIQAATWPRIREFEIEEAPAMEFVESGKEL 

RAEMYARLSRENSLG* 

>G974 (377..1162)

AAAAAAAAAGTTGATATACTTTCTGGTTTTCTCCTTAACTTTTATTCTTTACAAATCCAT 
CCCCCTTAGATCTGTTTATTTCCCGCTACTTTGATTCATTTCTGTTAGTAATCTGTCTTT 
CGTATAGAAGAAAACTGATTTCTTGGTTTGTATTTTCTTAAAGAGATCAATCTTTTTTTA 

TTAATTCCCTCCTCTCAGAAATCTACACAGAGGTTTTTTATTTTATAAACCTCTTTTTCG 
ATTTTCTTGAAAACAAAAAATCCTGTTCTTTACTTTTTTTACAAGAACAAGGGAAAAAAA 
TTTCTTTTTATTAGAAATGACAACTTCTATGGAT^ 

ATCTGATCCATTCGGTGGTGAATTAATGGAAGCGCTTTTACCTTTTATCAAAAGCCCTTC 

CAACGATTCATCCGCGTTTGCGTTCTCTCTACCCGCTCCAATTTCATACGGGTCGGATCT 

CCACTCATTTTCTCACCATCTTAGTCCTAAACCGGTCTCAATGAAACAAACCGGTACTTC 

CGCGGCTAAACCGACGAAGCTATACAGAGGAGTGAGACAACGTCACTGGGGAAAATGGGT

GGCTGAGATTCGTTTACCGAGGAATCGAACTCGACTTTGGCTCGGAACATTCGACACGGC 

GGAGGAAGCTGCTTTAGCTTATGACAAGGCGGCGTATAAGCTCCGAGGAGATTTTGCGCG 

GCTTAATTTCCCTGATCTCCGTCATAACGACGAGTATCAACCTCTTCAATCATCAGTCGA 

CGCTAAGCTTGAAGCTATTTGTCAAAACTTAGCTGAGACGACGCAGAAACAGGTGAGATC 

AACGAAGAAGTCTTCTTCTCGGAAACGTTCATCAACCGTCGCAGTGAAACTACCGGAGGA 

GGACTACTCTAGCGCCGGATCTTCGCCGCTGTTAACGGAGAGTTATGGATCTGGTGGATC 

TTCTTCGCCGTTGTCGGAGCTGACGTTTGGTGATACGGAGGAGGAGATTCAGCCGCCGTG 

GAACGAGAACGCGTTGGAGAAGTATCCGTCGTACGAGATCGATTGGGATTCGATTCTTCA 

GTGTTCGAGTCTTGTAAATTAGATGTTGCCATAGGGGTATTTTAGGGACTTTAGAGCTCT 

CTGCGATGGAGTTTTTGGTCATTGCAGAGATTTTATTATTATTAAGGGGGTTTGTTATGT 

TAATATCAAATAAGTTTATCTACTTTGATGTTAATTAGTGTTAATCTCTGCGTCGGTCCA 

AGCTGTTTTTTTTTGGCATGCTTCGACCGTGTGAGATTTCTTATGTAATTTTTGTAGTTC 

CTTGATTTTCTTAGTTCAAGTTAAATTGGCACAAAAAAAAAAAAAAAAAA 

>G974 Amino Acid Sequence (domain in AA coordinates: 81-140) 

MTTSMDFYSNKTFQQSDPFGGELMEALLPFIKSPSNDSSAFAFSLPAPISYGSDLHSFSH 

HLSPKPVSMKQTGTSAAKPTKLYRGVRQRHWGKWVAEIRLPRNRTRLWLGTFDTAEEAAL

AYDKAAYKLRGDFARLNFPDLRHNDEYQPLQSSVDAKLEAICQNLAETTQKQVRSTKKSS

SRKRSSTVAVKLPEEDYSSAGSSPLLTESYGSGGSSSPLSELTFGDTEEEIQPPWNENAL

EKYPSYEIDWDSILQCSSLVN* 

>G1419 (27..692)

GAAGACTCCAACATAATTCATCATCTATGGCTTCTTCACATCAACAACAGCAAGAACAAG 

ACCAGTCAGCTTTAGATCTCATAACCCAACACCTTCTTACTGATTTCCCTTCCTTAGACA 

CCTTTGCCTCCACCATCCACCACTGCACCACCTCAACTCTAAGCCAACGCAAACCACCTC 

TTGCCACTATAGCAGTTCCTACTACTGCACCGGTGGTTCAAGAGAATGATCAAAGGCATT 

ACAGAGGCGTCAGGAGAAGACCATGGGGTAAGTATGCGGCTGAGATCAGAGACCCAAACA 

AGAAAGGTGTTCGTGTCTGGTTAGGCACTTTTGACACAGCCATGGAAGCTGCAAGAGGTT

ATGACAAGGCAGCTTTTAAACTACGAGGAAGCAAAGCTATTCTTAACTTCCCACTTGAAG 

CAGGAAAGCATGAGGACTTGGGAGACAACAAGAAGACTATTTCTTTAAAAGCAAAGAGGA 

AGAGACAGGTGACGGAGGATGAAAGCCAGCTGATCAGCCGTAAAGCTGTTAAGAGGGAAG 

AAGCTCAGGTTCAGGCTGATGCTTGTCCATTAACGCCATCAAGTTGGAAGGGGTTTTGGG

ACGGAGCAGACAGTAAAGACATGGGAATATTTTCCGTGCCTCTGTTATCTCCITGTCC^ 

CTCTTGGACACTCTCAACTCGTAGTTACTTAAGCTTCAGAGGGTCAAACTGGAAAAAATC 

AACATTGGATTGTTTTCAAAGCTTCTAGATTAGCTGATTGTAAAAAAATGTTTTACTATA 

TTCATTCATTCTTCTTAAATGCAATTCTTTCTACCCTTCC 

>G1419 Amino Acid Sequence (domain in AA coordinates: 69-137) 

MASSHQQQQEQDQSALDLITQHLLTDFPSLDTFASTIHHCTTSTLSQRKPPLATIAVPTT

APVVQENDQRHYRGVRRRPWGKYAAEIRDPNKKGVRVWLGTFDTAMEAARGYDKAAFKLR

GSKAILNFPLEAGKHEDLGDNKKTISLKAKRKRQVTEDESQLISRKAVKREEAQVQADAC

PLTPSSWKGFWDGADSKDMGIFSVPLLSPCPSLGHSQLVVT*

>G1634 (22..855)

TTATCTCGTAGCCTTTAAACGATGGAGACTCTGCATCCACTACTCTCGCACGTGCCAACT 

TCTGACCACCGGTTTGTAGTTCAAGAGATGATGTGCTTGCAAAGCTCGAGCTGGACTAAA 

GAAGAGAACAAGAAGTTTGAGCGAGCTCTTGCTGTCTACGCTGATGACACGCCTGATCGC 

TGGTTCAAAGTTGCTGCTATGATCCCTGGAAAGACCATATCAGATGTCATGAGGCAATAC 

TCTAAGCTTGAAGAAGACCTCTTCGATATCGAAGCAGGACTTGTCCCGATCCCGGGTTAC 

CGTTCAGTTACTCCTTGTGGATTTGATCAGGTTGTGAGTCCACGTGACTTTGATGCGTAT 

CGTAAACTTCCTAATGGAGCCAGAGGATTTGATCAAGACCGTAGGAAAGGAGTTCCATGG 

ACGGAGGAAGAACACAGGAGATTCTTGTTAGGGCTTCTCAAGTATGGGAAAGGAGATTGG 

AGAAACATATCGAGGAACTTTGTGGGATCAAAAACACCAACTCAGGTTGCAAGTCATGCC 

CAAAAGTACTACCAAAGACAGCTTTCCGGTGCGAAAGACAAACGACGGCCTAGCATTCAC 

GACATCACCACCGTCAATCTTCTCAATGCCAATCTTAGCCGTCCATCGTCTGATCACGGT 

TGCTTAGTCTCAAAACAGGCCGAGCCGAAACTAGGGTTCACCGACAGGGATAATGCAGAG 

GAGGGAGTTATGTTTCTTGGTCAGAATCTATCCTCGGTCTTCTCTTCCTACGATCCTGCC

ATTAAGTTTTCCGGAGCAAATGTTTACGGTGAAGGAGGTTACTGTATCTCACAAGATCTT 

GAAACGAGAAAATGAGAATTTTGAAATTTTAACTATTGCAACGAAACCATAATTGC 

>G1634 Amino Acid Sequence (domain in AA coordinates: 129-180)

METLHPLLSHVPTSDHRFVVQEMMCLQSSSWTKEENKKFERALAVYADDTPDRWFKVAAM

IPGKTISDVMRQYSKLEEDLFDIEAGLVPIPGYRSVTPCGFDQVVSPRDFDAYRKLPNGA 

RGFDQDRRKGVPWTEEEHRRFLLGLLKYGKGDWRNISRNFVGSKTPTQVASHAQKYYQRQ

LSGAKDKRRPSIHDITTVNLLNANLSRPSSDHGCLVSKQAEPKLGFTDRDNAEEGVMFLG 

QNLSSVFSSYDPAIKFSGANVYGEGGYCISQDLETRK* 

>G1637 (1..954) 

ATGGTGAAGGAGACGGTGACGGTGGCGAAAACGTGCTCACACTGTGGCCATAATGGCCAT 
AACGCACGGACTTGTCTCAACGGCGTTAATAAGGCAAGTGTTAAACTGTTCGGCGTTAAT 
ATATCGTCTGATCGGATTAGGCCGCCTGAGGTAACGGCGTTAAGGAAGAGTCTTAGTTTG 
GGAAACCTTGATGCTCTTCTCGCTAACGATGAAAGTAACGGTAGCGGTGATCCTATCGCC 
GCCGTTGATGATACCGGTTATCATTCCGATGGTCAGATTCATTCCAAGAAGGGTAAAACT 
GCTCATGAGAAGAAAAAGGGGAAGCCATGGACGGAAGAAGAACATCGTAATTTCTTAATC 
GGTTTAAACAAACTCGGAAAAGGAGATTGGAGAGGCATTGCAAAGAGTTTCGTGTCGACA 
AGAACACCAACACAAGTCGCAAGTCATGCTCAGAAATATTTTATTAGGTTAAACGTTAAC 
GACAAGAGAAAAAGACGTGCTAGTCTCTTTGACATCTCTCTCGAAGATCAGAAGGAGAAA 
GAGAGGAACTCTCAAGATGCTTCAACAAAGACTCCACCTAAACAACCAATAACCGGAATT 
CAACAACCGGTAGTACAAGGTCATACTCAAACCGAGATTTCGAACAGGTTTCAGAATTTA 
TCAATGGAGTATATGCCAATCTACCAACCCATACCACCTTACTACAACTTTCCACCTATT 
ATGTACCATCCAAATTATCCAATGTACTATGCCAACCCTCAAGTACCGGTTAGGTTTGTT

CATCCTTCTGGTATACCTGTTCCAAGACATATACCGATTGGTTTGCCTCTGTCTCAACCG 

AGTGAAGCTTCTAATATGACAAATAAAGACGGTTTGGATCTTCATATCGGTTTGCCTCCA 

CAAGCTACTGGAGCTTCTGACTTGACTGGTCATGGCGTTATTCATGTGAAATGA 

>G1637 Amino Acid Sequence (domain in AA coordinates: 109-173)

MVKETVTVAKTCSHCGHNGHNARTCLNGVNKASVKLFGVNISSDPIRPPEVTALRKSLSL

GNLDALLANDESNGSGDPIAAVDDTGYHSDGQIHSKKGKTAHEKKKGKPWTEEEHRNFLI

GLNKLGKGDWRGIAKSFVSTRTPTQVASHAQKYFIRLNVNDKRKRRASLFDISLEDQKEK

ERNSQDASTKTPPKQPITGIQQPVVQGHTQTEISNRFQNLSMEYMPIYQPIPPYYNFPPI
MYHPNYPMYYANPQVPVRFVHPSGIPVTRHIPIGLPLSQPSEASNMTNKDGLDLHIGLPP
QATGASDLTGHGVIHVK* 
>G1818 (601..1161)

TAACAAATCAAATAATTAGAGAAATAACCAAAATTTAACTTTTAGAGGGACTACAGGATT 
TGTACTTTGTACATTCATATATTATTGTTATATA^ 

TGTAAATTAAGTAAAATTCAATTTAACATCATGAGGAAATTCTTATTAAAATTCTCTTAA 

AATTTTGAGCAAATTATGCTTTCACATTTAACATTTGAAAACATCATTTTTAACAAGATA

TTCAAAACTAAGTTTTGTACAGCAAAATTTTAACTTTCAATTTTATAGAGAAAAAGGTAT 

TTTTTTTTTTGTTTCATTl w I M rATAAGACTATTATTTGGTATATAATATACACTTTAAGTA 

AAAACAAATCTCTTTCTTTTTTCTTCTTATAATACCAACCACAAGTCTGTCAGTCACACA 

CATACAGTTAATAACATTAAATATTCTTAACAAAGTACTAAATAGGTTGAGATTCATATA 

TGTAAAGAGATCACTTCTTAATCTTATCCTACCATATCTTATATACGCTTAATTTTCCTT 

TATATATGCAAACCTCCACATAAAAATATCTCAAACCCAAACACTTCAAACAAAAAAAAA 

ATGGAGAACAACAACAACAACCACCAACAGCCACCGAAAGATAACGAGCAACTAAAGAGT 

TTCTGGTCAAAGGGGATGGAAGGTGACTTGAATGTCT^AGAATCACGAGTTCCCCATCTCT 

CGTATCAAGAGGATAATGAAGTTTGATCCGGATGTGAGTATGATCGCTGCTGAGGCTCCA 

AATCTCTTATCTAAGGCTTGTGAAATGTTTGTCATGGACCTCACGATGCGTTCATGGCTC 

CATGCTCAAGAGAGCAACCGACTCACGATACGGAAATCTGATGTTGATGCCGTAGTGTCT 

CAAACCGTCATCTTTGATTTCTTGCGTGATGATGTCCCTAAGGACGAGGGAGAGCCCGTT 

GTCGCCGCTGCTGATCCTGTGGACGATGTTGCTGATCATGTGGCTGTGCCAGATCTTAAC

AATGAAGAACTGCCGCCGGGAACGGTGATAGGAACTCCGGTTTGTTACGGTTTAGGAATA 

CACGCGCCACACCCGCAGATGCCTGGAGCTTGGACCGAGGAGGATGCGACTGGGGCAAAT 

GGAGGAAACGGTGGGAATTAATATTTGGATTGGGTTTTGTAACCGCTGTTGTGAGAACTT 

GAATTTCTTTTTGAGTTCTGCTTATGTTTTCAATGTTATGTTTTTTAGTTGTTGAATGTA 

TTTCTGTTGTTTTGTCCAAAAAAAAAAAAGAATGTATTTCTGTTGTTGTCITT 

ATCTAATGGTTTATGAATATTGGCTTTAGATTAATTTATGCATACAAAAACACAAGGATT 

ACGGATAAAAAAGTCCTCAGTTTACCCATGGAAACATAATCTTCTAGTGATTCCTTATGA 

GAGTAGAAAAGAATCATATATTATAATCTATTTCATAAGAGATAGGGTACTGTAAACAAG 

GATGTTTATTCGGCTATTTCTTTTTTTTTTAA 

GTTTGCAGCTTTTTGTTAGATTACATTCTAGAGGCAACAAGATCCAGAGATCTAGCAAAA 

AAAACTTATTTTGAAACCTGAATCTATTTTAAAAATTTTCCAACTCATTTTTCGTTCTTA 

TTCTTTGTTTTCCAACGGAATTTGGCGCACAAACGATTTATTTGAATTTTC 

>G1818 Amino Acid Sequence (domain in AA coordinates: 36-113) 

MENNNNNHQQPPKDNEQLKSFWSKGMEGDLNV^

NLLSKACEMFVMDLTMRSWLHAQESNRLTIRKSDVDAVVSQTVIFDFLRDDVPKDEGEPV

VAAADPVDDVADHVAVPDLNNEELPPGTVIGTPVCYGLGIHAPHPQMPGAWTEEDATGAN
GGNGGN* 

>G1820 (1..609) 

ATGGCTGAGAACAA6AACAACAACGGCGACAACATGAACAACGACAACCACCAGCAACCA 
CCGTCGTACTCGCAGCTGCCGCCGATGGCATCATCCAACCCTCAGTTACGTAATTACTGG 
ATTGAGCAGATGGAAACCGTCTCGGATTTCAAAAACCGTCAGCTTCCATTGGCTCGAATT 
AAGAAGATCATGAAGGCTGATCCAGATGTGCACATGGTCTCCGCAGAGGCTCCGATCATC 
TTCGCAAAGGCTTGCGAAATGTTCATCGTTGATCTCACGATGCGGTCGTGGCTCAAAGCC 
GAGGAGAACAAACGCCACACGCTTCAGAAATCGGATATCTCCAACGCAGTGGCTAGCTCT 
TTCACCTACGATTTCCTTCTTGATGTTGTCCCTAAGGACGAGTCTATCGCCACCGCTGAT 
CCTGGCTTTGTGGCTATGCCACATCCTGACGGTGGAGGAGTACCGCAATATTATTATCCA 
CCGGGAGTGGTGATGGGAACTCCTATGGTTGGTAGTGGAATGTACGCGCCATCGCAGGCG 
TGGCCAGCAGCGGCTGGTGACGGGGAGGATGATGCTGAGGATAATGGAGGAAACGGCGGC 
GGAAATTGA

>G1820 Amino Acid Sequence (domain in AA coordinates: 70-133) 
MAENNNNNGDNMNNDNHQQPPSYSQLPPMASSNPQLRNYWIEQMETVSDFKNRQLPLARI

KKIMKADPDVHMVSAEAPIIFAKACEMFIVDLTMRSWLKAEENKRHTLQKSDISNAVASS
FTYDFLLDVVPKDESIATADPGFVAMPHPDGGGVPQYYYPPGVVMGTPMVGSGMYAPSQA
WPAAAGDGEDDAEDNGGNGGGN* 
>G1903 (1..1200) 

ATGTCTAAATCTAGAGATACGGAGATAAAGTTGTTTGGGAGGACAATCACATCTCTTTTA 
GATGTGAATTGTTATGATCCGTCGTCGTTGTCCCCTGTTCACGATGTTTCTTCTGATCCA 
AGCAAGGAGGATTCGTCTTCTTCTTCATCTTCTTGTTCTCCAACTATTGGACCAATCAGG 
GTTCCGGTTAAAAAAAGTGAGCAAGAGAGTAACAAATTCAAAGATCCATATATATTATCC 
GATCTAAACGAACCACCAAAAGCAGTATCTGAGATTTCATCACCAAGAAGTTCCAAGAAC 
AACTGTGATCAACAGAGCGAGATCACAACAACAACTACCACAAGTACTACATCAGGAGAG 
AAATCAACGGCTCTCAAGAAACCGGACAAGCTTATTCCATGTCCTAGATGTGAAAGCGCA 
AACACCAAATTCTGTTATTACAAO^CTACAACGTGAACCAGCCACGTTACTTCTGCAGG 
AACTGTCAGAGGTATTGGACAGCTGGTGGATCTATGAGGAACGTTCCTGTTGGCTCAGGT
CGTCGCAAGAACAAAGGATGGCCTTCTTCAAACCATTACTTGCAAGTCACTTCTGAGGAT 
TGTGATAATAATAACTCGGGGACGATCCTTAGTTTCGGTTCTTCGGAGTCTTCGGTTACA
GAGACTGGTAAGCATCAGTCAGGTGATACAGCAAAGATAAGTGCTCATTCAGTTTCTCAA 
GAAAATAAAAGCTACCAAGGGTTTCTTCCTCCGCAAGTAATGTTACCTAATAATTCTTCT 
CCTTGGCCTTACCAATGGAGTCCAACGGGTCCTAACGCTAGTTTCTACCCTGTCCCCTTC 
TACTGGGGATGCACGGTTCCGATATACCCTACCTCAGAGACTTCATCATGTTTAGGAAAA 
CGGTCAAGAGATCAAACTGAAGGAAGAATCAATGATACTAATACAACAATAACTACTACA 
AGAGCAAGATTGGTCTCAGAATCTCTTAGAATGAATATCGAAGCTAGTAAGAGCGCTGTG 
TGGTCTAAGTTACCGACAAAACCCGAGAAAAAAACGCAAGGATTCAGTTTGTTCAATGGA 
TTTGACACAAAGGGAAACAGCAACAGAAGTAGCTTGGTCTCCGAAACTTCTCACAGTCTA 
CAAGCAAACCCTGCAGCGATGTCTAGAGCTATGAACTTCAGGGAGAGCATGCAACAATAA 
>G1903 Amino Acid Sequence (domain in AA coordinates: 134-180) 
MSKSRDTEIKLFGRTITSLLDVNCYDPSSLSPVHDVSSDPSKEDSSSSSSSCSPTIGPIR 
VPVKKSEQESNKFKDPYILSDLNEPPKAVSEISSPRSSKNNCDQQSEITTTTTTSTTSGE
KSTALKKPDKLIPCPRCESANTKFCYYNNYNVNQPRYFCRNCQRYWTAGGSMRNVPVGSG

RRKNKGWPSSNHYLQVTSEDCDNNNSGTILSFGSSESSVTETGKHQSGDTAKISADSVSQ

ENKSYQGFLPPQVMLPNNSSPWPYQWSPTGPNASFYPVPFYWGCTVPIYPTSETSSCLGK 

RSRDQTEGRINDTNTTITTTRARLVSESLRMNIEASKSAVWSKLPTKPEKKTQGFSLFNG 

FDTKGNSNRSSLVSETSHSLQANPAAMSRAMNFRESMQQ* 

>G371 (1..582) 

ATGGAGATTGAGAAGGATGAGGACGACACAACATTGGTTGATTCTGGAGGAGACTTCGAC 
TGCAACATATGTTTGGATCAGGTTCGAGACCCGGTCGTGACTTTATGTGGCCACCTGTTT 
TGTTGGCCCTGCATTCACAAGTGGACTTATGCGTCCAACAATTCAAGACAACGAGTCGAT 
CAATACGATCATAAGAGGGAACCACC^AAATGTCCGGTATGCAAATCTGATGTCTCCGAG 
GCTACGCTTGTCCCGATCTACGGACGAGGACAGAAAGCTCCCCAGTCCGGTTCAAATGTA 
CCGAGCAGACCAACTGGTCCGGTTTATGACTTAAGAGGAGTTGGTCAACGTTTAGGAGAA 
GGGGAGAGTCAACGTTACATGTATAGAATGCCTGATCCGGTGATGGGTGTGGTATGCGAA 
ATGGTATACCGGAGACTATTTGGAGAGTCTTCGAGCAACATGGCACCTTACCGCGATATG 
AATGTCCGGTCTAGGCGACGGGCAATGCAGGCTGAGGAGTCATTAAGCAGAGTCTACTTG 
TTTCTACTTTGCTTCATGTTTATGTGTCTATTTCTCTTCTAA 

>G371 Amino Acid Sequence (domain in aa coordinates: 21-74) 

MEIEKDEDDTTLVDSGGDFDCNICLDQVRDPVVTLCGHLFCWPCIHKWTYASNNSRQRVD

QYDHKREPPKCPVCKSDVSEATLVPIYGRGQKAPQSGSNVPSRPTGPVYDLRGVGQRLGE 

GESQRYMYRMPDPVMGVVCEMVYRRLFGESSSNMAPYRDMNVRSRRRAMQAEESLSRVYL

FLLCFMFMCLFLF* 

>G597 (255..1310)

AAAATTCTCCTGTAAAATTTAATATTATAAAAGTGGTTTCTTTTTCATTTATGTTTATAT 
AATTTTCATCTTTAATCTTAAATTCTGGTAACCTTAATGCGCGATCCGCTTTTCTAAAGT 
TTTGTGAGAGAGAAGAGATCTAAAAAAATCCACAATTTTGTTCAAATCTTGGAGTTAAAT 
GCTGAATTTTAGGCCTTGTTGCTTAGATTTATGGCTTAAAGTTTCAAACTTTTCATTGGA 
TATGTGAGAAGAAAATGTCAGGATCTGAGACGGGTTTAATGGCGGCGACCAGAGAATCAA 
TGCAATTTACAATGGCTCTCCACCAGCAGCAGCAACACAGTCAAGCTCAACCTCAGCAGT

CTCAGAACAGGCCATTGTCATTCGGTGGAGACGACGGAACTGCTCTTTACAAGCAGCCGA 

TGAGATCAGTATCACCACCGCAGCAGTACCAACCCAACTCAGCTGGTGAGAATTCTGTCT

TGAACATGAACTTGCCCGGAGGTGAGTCTGGAGGCATGACTGGAACTGGAAGTGAGCCAG 

TGAAAAAGAGGAGAGGTAGACCGAGGAAATATGGGCCTGATAGTGGTGAAATGTCACTTG 

GTTTGAATCCTGGAGCTCCTTCTTTCACTGTCAGCCAACCTAGTAGCGGCGGCGATGGAG 

GAGAGAAGAAGAGAGGAAGACCTCCTGGTTCTTCTAGCAAAAGGCTCAAGCTTCAAGCTT 

TAGGCTCGACTGGAATCGGATTTACGCCTCATGTACTTACCGTGCTGGCTGGAGAGGATG 

TATCATCCAAGATAATGGCGTTAACTCATAATGGACCCCGTGCTGTGTGTGTCTTGTCTG 

CAAATGGAGCCATCTCCAATGTGACTCTCCGCCAGTCTGCCACATCCGGTGGAACTGTTA 

CATATGAGGGGAGATTTGAGATTCTGTCTTTATCGGGATCTTTCCATTTGCTGGAGAACA 

ATGGTCAAAGAAGCAGGACGGGAGGTCTAAGCGTGTCATTATCAAGTCCGGATGGTAATG 

TCCTCGGTGGCAGTGTAGCTGGTCTTCTTATAGCAGCATCACCTGTTCAGATTGTTGTTG 

GGAGTTTCTTACCAGACGGAGAAAAAGAACCAAAACAGCATGTGGGACAAATGGGACTGT 

CGTCACCCGTATTACCGCGTGTGGCCCCAACGCAGGTGCTGATGACTCCAAGTAGCCCAC 

AATCTCGAGGCACAATGAGTGAGTCATCTTGTGGAGGAGGACATGGAAGCCCTATTCATC 

AGAGCACTGGAGGACCTTACAATAACACCATTAACATGCCCTGGAAGTAGCCAAGTGATC 

TGTGTCGGCTTAAAACCAACAACTTCCCGTTATTAGAGTGATTTATTTCTACATTTGG 

TAGACITrCTAGTTCTGATGGTTATTC 

GACAAAAGGAGTTTGATAAATTGACCGACCTATTTTGTGTGTTTGAGGTACTTTCAGAAC 
CATAGGTGTTCAGAAATTAGAATGTTCTGTTTAAAAAA 

>G597 Amino Acid Sequence (domain in AA coordinates: 97-104, 137-144)
MSGSETGLMAATRESMQFTMALHQQQQHSQAQPQQSQNRPLSFGGDDGTALYKQPMRSVS 
PPQQYQPNSAGENSVLNMNLPGGESGGMTGTGSEPVKKRRGRPRKYGPDSGEMSLGLNPG 
APSFTVSQPSSGGDGGEKKRGRPPGSSSKRLKLQALGSTGIGFTPHVLTVLAGEDVSSKI
MALTHNGPRAVCVLSANGAISNVTLRQSATSGGTVTYEGRFEILSLSGSFHLLENNGQRS
RTGGLSVSLSSPDGNVLGGSVAGLLIAASPVQIVVGSFLPDGEKEPKQHVGQMGLSSPVL
PRVAPTQVLMTPSSPQSRGTMSESSCGGGHGSPIHQSTGGPYNNTINMPWK* 
>G1009 (28..1704)

AAAAAAAAAAAAAACCTATTCCGAAAGATGAAGAACAATAACAACAAATC 

TCTAGCTATGATTCTTCTTTGTCTCCTTCTTCTTCATCCTCCTCCCACCAGAACTGGCTC

TCTTTCTCTCTCTCCAACAATAACAACAACTTCAATTCTTCCTCAAACCCTAATCTCACT 

TCCTCCAC^TCAGATCATCATC^TCCTCACCCTTCTCACCTCTCTCTCTTTCAAGCTTTC 

TCCACTTCTCCAGTCGAACGGCAAGATGGGTCACCGGGAGTTTCACCCAGCGATGCCACG 

GCGGTTCTTTCCGTATACCCCGGCGGTCCTAAACTTGAGAACTTCCTCGGCGGAGGAGCC 

TCAACGACGACAACAAGACCAATGCAACAAGTGCAATCTCTTGGCGGCGTTGTCTTCTCT 

TCCGACCTACAGCCACCGCTTCATCCTCCGTCCGCCGCCGAGATCTACGACTCTGAGCTC 

AAGTCAATAGCCGCTAGCTTCCTAGGAAACTACTCCGGTGGACACTCGTCGGAGGTCTCT 

AGCGTACATAAACAACAACCGAATCCTCTAGCTGTCTCAGAGGCTTCGCCTACTCCGAAG 

AAGAACGTAGAGAGTTTTGGACAACGTACCTCGATTTATAGAGGAGTCACAAGACATAGA 

TGGACTGGAAGATACGAAGCTCATCTATGGGATAATAGTTGCCGAAGAGAAGGCCAAAGC 

AGAAAAGGAAGACAAGTTTATTTAGGTGGTTATGATAAGGAAGATAAAGCAGCTAGAGCT 

TACGACCTTGCAGCTCTTAAGTATTGGGGTCCTACAACTACGACTAATTTCCCGATATCA 

AATTACGAATCTGAACTTGAAGAAATGAAACACATGACTCGACAAGAGTTCGTTGCTTCT 

TTAAGACGGAAAAGCAGTGGATTCTCTAGGGGTGCCTCCATGTACAGAGGCGTCACTAGA 

CATCATCAGCATGGTCGATGGCAGGCACGAATTGGAAGAGTTGCAGGCAACAAAGACCTT 

TATCTTGGCACATTTAGCACTCAAGAGGAAGCTGCAGAAGCTTATGATATAGCAGCGATC 

AAATTCCGCGGTCTAAATGCAGTCACCAATTTCGACATCAGTCGATATGATGTCAAATCA 

ATTGCTAGCTGTAATCTCCCTGTGGGTGGACTAATGCCTAAACCTTCTCCAGCAACCGCA 

GCGGCTGACAAAACCGTTGATCTTTCTCCATCCGACTCTCCATCTCTAACCACACCGTCC 

CTCACGTTCAATGTGGCAACACCGGTCAATGACCATGGAGGAACTTTTTACCACACTGGT 

ATACCAATCAAACCAGACCCGGCTGATCATTATTGGTCCAACATCTTTGGATTCCAGGCA 

AACCCGAAAGGAGAAATGCGACCATTAGCAAACTTTGGGTCGGATCTTCATAAGCCTTCT 

CCTGGTTATGCTATAATGCCGGTAATGCAGGAAGGTGAAAACAACTTTGGTGGTAGTTTT 

GTTGGGTCTGATGGGTATAACAATCATTCCGCTGCATCGAACCCGGTCTCAGCAATTCCG 

CTGTCCTCGACAACTACAATGAGTAACGGTAACGAAGGGTATGGTGGAAACATAAACTGG 

ATTAATAACAACATTTCAAGTTCTTACCAAACTGCAAAATCAAATCTCTCTGTTTTGCAC 
ACACCGGTTTTTGGGTTGGAATGAGTATTCACATCTTAGTGAGAACTAAAATAAATATGT
AGGAAAAAAATAAGGCTCTGTTTGAAGAAATCAGATATTTTCTTCTTAGATTATTTAAGT 
AGTTTAAAAAAAATATTTTTTAAGTGTTTCACT^ 

TTGCTGGATCTGACAGTACTAACTCTTTGTTTAATGACCTTATGGGTTCCTTT^ 
TCCAGAACTTTTATTTACTTTTTTCTTCATTTTTCTTCATTTTTTTTGTTGTGGG 
ATGAATGATTGAAGATGGAAACTGCTTGCATGTGAATAAACGAAAATCAAACWATCTTCG 
GTAACTTAAAAA 

>G1009 Amino Acid Sequence (domain in aa coordinates: 201-277, 303-371) 
MKNNNNKSSSSSSYDSSLSPSSSSSSHQNWLSFSLSNNNNNFNS 

HPSHLSLFQAFSTSPVERQDGSPGVSPSDATAVLSVYPGGPKLENFLGGGASTTTTRPMQ 
QVQSLGGVVFSSDLQPPLHPPSAAEIYDSELKSIAASFLGNYSGGHSSEVSSVHKQQPNP
LAVSEASPTPKKNVESFGQRTSIYRGVTRHRWTGRYEAHLWDNSCRREGQSRKGRQVYLG 
GYDKEDKAARAYDLAALKYWGPTTTTNFPISNY 

RGASMYRGVTRHHQHGRWQARIGRVAGNKDLYLGTFSTQEEAAEAYDIAAIKFRGLNAVT
NFDISRYDVKSIASCNLPVGGLMPKPSPATAAADKTVDLSPSDSPSLTTPSLTFNVATPV
NDHGGTFYHTGIPIKPDPADHYWSNIFGFQA2STPKAEMRPLANFGSDLHNPSPGYAIMPVM 
QEGENNFGGSFVGSDGYNNHSAASNPVSAIPLSSTTTMSNGNEGYGGNINWINNNISSSY
QTAKSNLSVLHTPVFGLE*
>G170 (1..1107) 

ATGGGGATGAAGAAGGTGAAGCTATCTTTGATAGCTAATGAAAGATCAAGGAAAACATCC 
TTCATAAAGAGGAAAGACGGGATTTTTAAGAAACTCCACGAGTTGTCAACTCTGTGTGGT 
GTCCAAGCTTGTGCTCTCATCTACAGTCCATTCATACCGGTTCCAGAGTCATGGCCGTCA 
AGGGAAGGTGCTAAAAAGGTGGCTTCAAGGTTTCTGGAGATGCCGCCGACAGCCCGAACC 
AAGAAGATGATGGATC^GAGACTTACCTTATGGAGAGGATTACCAAAGCAAAAGAGCAA 
CTAAAGAACCTGGCTGCTGAGAACCGAGAGTTACAGGTTAGACGATTTATGTTTGATTGT 
GTTGAAGGCAAAATGTCCCAGTATCATTATGATGCAAAAGACCTTCAAGATTTGCAATCT 
TGTATAAATCTATATCTCGATCAGCTTAACGGAAGGATCGAGTCCATTAAAGAAAATGGT 
GAGTCGTTGTTGTC1TCCGTCTCTCCTTTTCCTACTAGAATTGGTGTTGACGAAATTGGT 
GATGAGTCATTTTCCGACTCTCCTATTCATGCTACAACTGGGGTTGTAGATACTCTTAAT 
GCTACCAATCCT(^TGTTCTTACGGGCGATATGACTCCTTTTCTTGATGCGGACGCAACT 
GCGGTAACTGCTTCCAGTAGATTTTTTGATCATATTCCATATGAAAATATGAATATGAGT 
CAAAATCTGCATGAACCGTTTCAACACCTTGTT^ 

AATCAGAATATGAATCAGGTTCAATACC^GGCTCCTAATAATCTGTTTAATCAGATTCAA 
CGAGAATTCTACAACATAAATTTGAATCTGAATTTGT^ATCTGAATTCGAATCAGTATCTG 
AATCAACAACAATCATTCATGAATCCGATGGTGGAACAACATATGAATCATGTTGGAGGG 
CGTGAAAGCATTCCTTTCGTGGACGGAAACTGCTACAACTACCATCAACTACCATCCAAT 
CAACTACCAGCCGTTGATCATGCTTCCACCAGTTAC^TGCCTTCCACCACCGGTGTCTAT 
GATCCTTACATCAACAATAATCTCTAA 

>G170 Amino Acid Sequence (domain in aa coordinates: 2-57) 

MGMKKVKLSLIANERSRKTSFIKRKDGIFKKLHELSTLCGVQACALIYSPFIPVPESWPS 

REGAKKVASRFLEMPPTARTKKMMDQETYLMERITKAK^ 

VEGKMSQYHYDAKDLQDLQSCINLYLDQLNGRIESIKENGESLLSSVSPFPTRIGVDEIG 

DESFSDSPIHATTGVVDTLNATNPHVLTGDMTPFLDADATAVTASSRFFDHIPYENMNMS

QNLHEPFQHLVPTNVCDFFQNQNMNQVQYQAPNNLFNQIQREFYNIl^ 

NQQQSFMNPMVEQHMNHVGGRESIPFVDGNCYNYH^ 

DPYINNNL* 

>G1768 (185..1426)

CTTCCTTTTGCTTCAGCTGCGAGCTTTGGTTGGATCTCTCACTTGCAAAACCAAATCCCT 
TATCGACTTCCACCGAAAGATCACTTCTTAACCTACACAAGGTGTTTGTTATGAAGATCA 
GATAAATAAAAGGTCATTTGAGGATAATGGTTGATGTTCAAAGATTCTTACTTGCTTATT 
TGTGATGGACAATGTAAGAGGTTCAATAATGTTGCAGCCACTGCCAGAGATAGCTGAGAG 
TATCGATGATGCTATCTGCCATGAACTCTCCATGTGGCCTGATGATGCTAAAGATTTGTT 
ATTGATAGTGGAGGCAATATCAAGGGGAGACTTGAAGTTGGTACTTGTTGCTTGTGCAAA 
AGCTGTTTCTGAGAATAATCTTCTAATGGCACGATGGTGTATGGGTGAGTTGCGCGGTAT 
GGTTTCGATTTCTGGTGAGCCAATCCAGAGATTGGGAGCTTATATGTTAGAAGGGCTTGT 
TGCTAGGCTTGCTGCTTCTGGTAGTTCGATATATAAGTCTCTCCAGTCCAGAGAACCAGA 
GAGTTATGAATTTTTATCTTATGTGTATGTTCTGCATGAGGTTTGTCCATATTTCAAGTT 
TGGATACATGTCAGCGAATGGTGCGATTGCAGAAGCAATGAAGGATGAAGAGAGGATTCA

CATTATTGACTTCCAAATTGGACAAGGGAGCCAGTGGATAGCACTTATCCAGGCTTTTGC 

AGCTAGGCCTGGTGGGGCTCCAAATATTCGAATTACCGGAGTTGGTGATGGATCTGTCTT 

GGTTACAGTCAAGAAGAGACTAGAGAAACTTGCAAAGAAGTTTGATGTTCCATTCAGGTT 

CAATGCGGTTTCAAGGCCAAGTTGTGAAGTTGAAGTGGAAAATCTTGATGTCCGAGATGG 

CGAAGCCCTTGGAGTGAACTTTGCTTACATGCTGCATC^TTTGCCAGATGAGAGTGTAAG 

CATGGAAAACCACAGGGACCGGTTGCTGAGGATGGTGAAGAGTCTATCACCTAAAGTAGT 

CACTCTTGTGGAACAAGAATGCAACACGAACACTTCCCCTTTCCTTCCTAGGTTCCTTGA 

GACATTAAGTTATTACACGGCAATGTTCGAATCTATCGATGTTATGCTTCCGAGAAATCA 

CAAGGAAAGGATCAATATCGAGCAGCACTGCATGGCAAGGGATGTCGTCAACATCATAGC 

TTGTGAAGGAGCCGAGAGGATCGAAAGACACGAGCTTCTCGGGAAATGGAAGTCAAGGTT 

TTCCATGGCGGGTTTTGAGCCATACCCCTTGAGCTCAATCATTTCAGCCA 

CCTCTTGAGAGATTACAGCAACGGGTATGCGATTGAAGAAAGAGATGGTGCTCTGTACCT 

TGGTTGGATGGACCGAATCTTGGTCTCATCTTGTGCATGGAAGTGAAAGAATAAACGTCT 

CCAAGAATGTAATGCAAAAGACAGAACTGGAAGTAATAGATAGTTTTGTCTCATAACCAT 

TAATAAGGTTGAATCAAATCATATACATCCCCATGCTAC7VACTATTACACAGGCTCCATC 

GGAAGTGGTTACAT 

>G1768 Amino Acid Sequence (domain in AA coordinates: 54-413) 

MDNVRGSIMLQPLPEIAESIDDAICHELSMWPDDAKDLLLIVEAISRGDLKLVLVACAKA

VSENNLLMARWCMGELRGMVSISGEPIQRLGAYMLEGLVARLAASGSSIYKSLQSREPES

YEFLSYVYVLHEVCPYFKFGYMSANGAIAEAMKDEERIHIIDFQIGQGSQWIALIQAFAA

RPGGAPNIRITGVGDGSVLVTVKKRLEKLAKKFDVPFRFNAVSRPSCEVEVENLDVRDGE

ALGVNFAYMLHHLPDESVSMENHRDRLLRMVKSLSPKVVTLVEQECNTNTSPFLPRFLET

LSYYTAMFESIDVMLPRNHKERINIEQHCMARDVVNIIACEGAERIERHELLGKWKSRFS

MAGFEPYPLSSIISATIRALLRDYSNGYAIEERDGALYLGWMDRILVSSCAWK* 

>G185 (77..988)

ATGCAAAAATAAACATAGTAACAATACTTTAAACTATTTACACCACTTTAATCTTATTCT 
CCACTCTTTGAACGTAATGGAGAAGAACCATAGTAGTGGAGAGTGGGAGAAGATGAAGAA 
CGAGATCAACGAGCTAATGATAGAAGGAAGAGACTATGCACACCAGTTTGGATCAGCTTC 
ATCTCAAGAAAC^CGTGAACATTTAGCCAAAAAGATTCTTCAATCTTACCACAAGTCTCT 
(^CCATCATGAACTACTCCGGCGAACTTGACC^GTTTCTCAGGGTGGAGGAAGCCCC^ 
GAGCGATGATTCCGATCAAGAACCACTTGTCATCAAGAGTTCGAAGAAGTCAATGCCAAG 
GTGGAGTTCAAAAGTCAGAATTGCCCCTGGAGCTGGTGTTGATAGAACGCTGGACGATGG 
ATTCAGTTGGAGAAAGTACGGCCAGAAGGATATTCTCGGAGCCAAATTTCCAAGAGGATA 
CTATAGATGCACGTATAGAAAGTCTCAAGGATGTGAAGCCACTAAACAAGTCCAAAGATC 
TGATGAAAATCAGATGCTCCTTGAGATCAGTTACCGAGGAATACATTCTTGCTCTCAAGC 
TGCAAATGTCGGTACAACAATGCCGATACAAAACCTCGAACCGAACCAGACCCAAGAACA 
CGGAAATCTTGACATGGTAAAGGAAAGTGTAGAC^ 

TCACAACCTTCACTATCCATTGTCATCTACCCCAAATCTAGAGAATAACAATGCCTATAT 
GCTTCAAATGCGAGATCAAAACATCGAATATTTTGGATCTACGAGCTTCTCTAGTGATCT 
AGGAACTAGTATCAACTACAATTTTCCAGCATCT^ 

CTCTCCGTCCACCGTCCCTTTGGAATCCCCGTTTGAAAGCTATGATCCAAATCATCCATA 
TGGAGGATTTGGTGGGTTCTATTCTTAGTTATCTACTTAAGGGAGGGACGGAACTTTTTA 
CATGACCTCTTGATTAAAGAGAGAGTTTTCATAATAGCTAATCAATTTCCTATTCAAATA
TCCGAGTTTTTTTTCTAATCATGTTTATCAATTGTCTTATTACAGAAGGCTTATTTTCAG 
GTCTATGTTGAAATAAATGGATTTGTACTCGTAGGTATGATCCTTGTTATCTAAAAAAAA 
AAAAA

>G185 Amino Acid Sequence (domain in AA coordinates: 113-172) 
MEKNHSSGEWEKMKNEINELMIEGRDYAHQFG 

SGELDQVSQGGGSPKSDDSDQEPLVIKSSKKSMPRWSSKVRIAPGAGVDRTLDDGFSWRK 
YGQKDILGAKFPRGYYRCTYRKSQGCEATKQVQRSDENQMLLEISYRGIHSCSQAANVGT 
TMPIQNLEPNQTQEHGNLDMVKESVDNYNHQAHLH^ 

QNIEYFGSTSFSSDLGTSINYNFPASGSASHSASNSPSTVPLESPFESYDPNHPYGGFGG 
FYS* 

>G1931 (5..592)

ATCAATGGAAGGGGTTGACAACACAAATCCTATGTTAACCCTAGAAGAAGGCGAAAACAA 
CAATCCTTTTTCTTCCTTAGATGACATU^CATTAATGATGATGGCTCCTTCGTTAATCTT

TTCGGGCGATGTAGGTCCATCTTCTTCTTCTTGTACTCCAGCAGGTTATCATCTATCTGC 

TCAGCTGGAGAACTTTCGAGGAGGTGGAGGAGAGATGGGAGGATTAGTGAGTAATAATAG 

CAATAATAGTGATCATAATAAGAATTGCAACAAAGGAAAAGGGAAGAGAACTTTGGCAAT 

GCAGAGGATAGCTTTTCATACAAGGAGTGATGATGATGTTCTTGATGATGGTTATCGTTG 

GCGAAAGTACGGTCAGAAATCTGTCAAGAACAATGCTCATCCCAGGAGCTATTATAGATG 

TACATACCACACATGCAACGTGAAGAAACAAGTGCAAAGACTGGCAAAAGATCCAAACGT 

TGTCGTAACAACCTACGAAGGTGTTCATAATCATCCTTGTGAGAAGCTCATGGAGACTCT 

TAGCCCTCTCCTTAGGCAACTTCAGTTCCTCTCAAGAGTTTCTGATCTGTAATTATTGAA 

TGTTAATTAGTGGTGTAATACATTAATTATGCTTTAATCTCTCCATTGACCCTCAATC 

>G1931 Amino Acid Sequence (domain in AA coordinates: 114-170) 

MEGVDNTNPMLTLEEGENNNPFSSLD^ 

LENFRGGGGEMGGLVSNNSNNSDHNKNCNKGKGKRTLAMQRIAFHTRSDDDVLDDGYRWR
KYGQKSVKNNAHPRSYYRCTYHTCNVKKQVQRLAKDPNVVVTTYEGVHNHPCEKLMETLS
PLLRQLQFLSRVSDL*
>G2543 (1..2169) 

ATGAGTTTCGTCGTCGGCGTCGGCGGAAGTGGTAGTGGAAGCGGCGGAGACGGTGGTGGT 

AGTCATCATCACGACGGCTCTGAAACTGATAGGAAGAAGAAACGTTACCATCGTCACACC 

GCTCAACAGATTCAACGCCTTGAATCGAGTTTCAAGGAGTGTCCTCATCCAGATGAGAAA 

CAGAGGAACCAGCTTAGC^GAGAATTGGGTTTGGCTCCAAGACAAATCAAGTTCTGGTTT 

CAGAACAGAAGAACTC^GCTTAAGGCTCAACATGAGAGAGCAGATAATAGTGCACTAAAG 

GCAGAGAATGATAAAATTCGTTGCGAAAACATTGCTATTAGAGAAGCTCTCAAGCATGCT 

ATATGTCCTAACTGTGGAGGTCCTCCTGTTAGTGAAGATCCTTACTTTGATGAACAAAAG 

CTTCGGATTGAAAATGCACACCTTAGAGAAGAGCTTGAAAGAATGTCTACCATTGCATCA 

AAGTACATGGGAAGACCGATATCGCAACTCTCTACGCTACATCCAATGCACATCTCACCG 

TTGGATTTGTCAATGACTAGTTTAACTGGTTGTGGACCTTTTGGTCATGGTCCTTCAC 

GATTTTGATCTTCTTCCAGGAAGTTCTATGGCTGTTGGTCCTAATAATAATCTGCAATCT 

CAGCCTAACTTGGCTATATCAGACATGGATAAGCCTATTATGACCGGCATTGCTTTGACT 

GCAATGGAAGAATTGCTCAGGCTTCTTCAGACUUUVTC 

GGCTGCAGAGACATTCTCAATCTTGGTAGCTATGAGAATGTTTTCCCAAGATC71AGTAAC

CGAGGGAAGAACCAGAACTTTCGAGTCGAAGCATCAAGGTCTTCTGGTATTGTCTTCATG 

AATGCTATGGCACTTGTCGACATGTTCATGGATTGTGTCAAGTGGACAGAACTCTTTCCC 

TCTATCATTGCAGCTTCTAAAACACTTGCAGTGATTTCTTCAGGAATGGGAGGTACCCAT 

GAGGGTGCATTGCATTTGTTGTATGAAGAAATGGAAGTGCTTTCGCCTTTAGTAGCAACA 

CGCGAATTCTGCGAGCTACGCTATTGTC^ACAGACTGAACAAGGAAGCTGGATAGTTGTA 

AACGTCTCATATGATCTTCCTCAGTTTGTTTCTCACTCTCAGTCCTATAGATTTCCATCT 

GGATGCTTGATTCAGGATATGCCCAATGGATATTCCAAGGTTACTTGGGTTGAACATATT 

GAAACTGAAGAAAAAGAACTGGTTCATGAGCTATACAGAGAGATTATTCACAGAGGGATT 

GCTTTTGGGGCTGATCGTTGGGTTACCACTCTCCAGAGAATGTGTGAAAGATTTGCTTCT 

CTATCGGTACCAGCGTCTTCATCTCGTGATCTCGGTGGAGTGATTCTATCACCGGAAGGG 

AAGAGAAGCATGATGAGACTTGCTCAGAGGATGATC^GCAACTACTGTTTAAGTGTCAGC 

AGATCCAACAACACACGCTCAACCGTTGTTTCGGAACTGAACGAAGTTGGAATCCGTGTG 

ACTGCACATAAGAGCCCTGAACCAAACGGCACAGTCCTATGTGCAGCCACCACTTTCTGG 

CTTCCCAATTCTCCTCAAAATGTCTTCAATTTCCTCAAAGACGAAAGAACCCGTCCTCAG 

TGGGATGTTCTTTCAAACGGAAACGCAGTGCAAGAAGTTGCTCACATCTCAAACGGATCA 

CATCCTGGAAACTGCATATCGGTTCTACGTGGATCCAATGCAACACATAGCAACAACATG 

CTTATTCTGCAAGAAAGCTCAACAGACTCATCAGGAGCATTTGTGGTCTACAGTCCAGTG 

GATTTAGCAGCATTGAACATCGCAATGAGCGGTGAAGATCCTTCTTATATTCCTCTCTTG 

TCCTCAGGTTTCACAATCTCACCAGATGGAAATGGCTCAAACTCTGAACAAGGAGGAGCC 

TCGACGAGCTCAGGACGGGCATCAGCTAGCGGTTCGTTGATAACGGTTGGGTTTCAGATA 

ATGGTAAGCAATTTACCGACGGCAAAACTGAATATGGAGTCGGTGGAAACGGTTAATAAC 

CTGATAGGAACAACTGTACATCAAATTAAAACCGCCTTGAGCGGTCCTACAGCTTCAACT 

ACAGCTTGA 

>G2543 Amino Acid Sequence (domain in AA coordinates: 31-91) 
MSFVVGVGGSGSGSGGDGGGSHHHDGSETDRKKKRYHRHTAQQIQRLESSFKECPHPDEK 
QRNQLSRELGLAPRQIKFWFQNRRTQLKAQHERADNSALKAENDKIRCENIAIREALKHA
ICPNCGGPPVSEDPYFDEQKLRIENAHLREELERMSTIASKYMGRPISQLSTLHPMHISP 
LDLSMTSLTGCGPFGHGPSLDFDLLPGSSMAVGPNNNLQSQPNLAISDMDKPIMTGIALT
AMEELLRLLQTNEPLWTRTDGCRDILNLGSYENVFPRSSNRGKNQNFRVEASRSSGIVFM 
NAMALVDMFMDCVKWTELFPSIIAASKTLAVISSGMGGTHEGALHLLYEEMEVLSPLVAT
REFCELRYCQQTEQGSWIVVNVSYDLPQFVSHSQSYRFPSGCLIQDMPNGYSKVTWVEHI 
ETEEKELVHELYREIIHRGIAFGADRWVTTLQRMCERFASLSVPASSSRDLGGVILSPEG 
KRSMMRLAQRMISNYCLSVSRSNNTRSTVVSELNEVGIRVTAHKSPEPNGTVLCAATTFW

LPNSPQNVFNFLKDERTRPQWDVLSNGNAVQEVAHISNGSHPGNCISVLRGSNATHSNNM
LILQESSTDSSGAFVVYSPVDLAALNIAMSGEDPSYIPLLSSGFTISPDGNGSNSEQGGA
STSSGRASASGSLITVGFQIMVSNLPTAKLNMESVETVNNLIGTTVHQIKTALSGPTAST
TA* 

>G264 (30..1430)

CTTGTACCAGTTTCTGATTAGATTCAACAATGAACGGCGCATTAGGTAACTCCTCCGCCT 
CCGTTAGCGGCGGAGAAGGAGCCGGAGGACCAGCGCCTTTCTTGGTGAAAACCTACGAGA 
TGGTCGACGATTCATC^UVCGGACCAGATCGTATCGTGGAGCGCTAACAACAACAGCTTCA 
TCGTTTGGAATCATGCCGAATTTTCACGCCTCCTTCOT 

ACTTCTCTTCCTTCATTCGTCAGCTCAATACCTATGGGTTTAGGAAGATTGATCCAGAGA 

GGTGGGAGTTTTTGAATGATGATTTTATTAAGGATCAGAAGCATCTTCTCAAGAATATAC 

ATAGAAGGAAACCTATACACAGCCACAGTCATCCACCTGCTTCGTCGACTGATCAAGAAA 

GAGCAGTGTTGCAAGAGCAAATGGACAAGCTTTCACGTGAGAAAGCTGCAATTGAAGCTA 

AGCTTTTAAAGTTC AAACAAC AGAAGGTTGTAG CAAAGCATCAGTTTGAAG AAATGAC TG 

AGCATGTTGATGATATGGAGAATAGGCAGAAGAAGCTGCTGAATTTTTTGGAAACTGCGA 

TTCGGAATCCTACTTTTGTTAAGAATTTTGGTAAGAAAGTCGAGCAGTTGGATATTTCAG 

CTTACAACAAAAAGCGAAGGCTCCCTGAAGTTGAGCAATCAAAGCCACCTTCAGAAGATT 

CTCATCTGGATAATAGTAGTGGTAGCTCGAGACGCGAGTCTGGAAACATTTTTCATCAAA 

ATTTCTCTAATAAATTGCGACTAGAGCTTTCTCCAGCTGATTCAGATATGAACATGGTTT 

CACACAGTATACAAAGTTCCAATGAAGAAGGTGCGAGTCCCAAAGGGATACTGTCAGGAG 

GTGATCCAAATACTACACTAACAAAAAGAGAAGGCCTACCATTTGCACCTGAAGCTCTAG 

AGCTTGCGGATACCGGGACATGCCCGAGGAGATTACTGTTAAATGATAATACAAGGGTGG 

AGACCTTGCAGCAGAGGCTAACTTCTTC^GAGGAGACTGATGGTAGCTTTTC^TGTCATT 

TAAATCTAACCCTGGCTTCTGCTCCGTTACCGGACAAAACAGCTTCACAGATAGCTAAGA 

CGACTCTTAAAAGTCAGGAGTTAAACTTTAACTCAATAGAAACAAGTGCAAGTGAGAAAA 

ATCGGGGTAGACAAGAGATTGCAGTTGGAGGTAGCCAAGCAAATGCAGCTCCTCCAGCAA 

GAGTGAATGATGTATTCTGGGAACAGTTCCTAACAGAAAGGCCAGGGTCTTCAGATAATG 

AGGAGGCAAGTTCGACTTATAGAGGTAACCCATACGAAGAGCAAGAGGAGAAAAGAAACG 

GGAGTATGATGTTACGTAATACAAAGAATATCGAGCAGCTGACCTTATAAACTATTTGGA 

CGGTTACATCAACGAGAGTACGAACTGAGGTTTTGGTAAGAAGTATGGGTGAGTAAGTAA 

TGAAACATTGGACTGAAAAAGCGTAAGTAGCTTTGTTGTAAACACTTGCGTCTCTGTCTA 

CACAAGTAATTTGACTGTAAATGTAAGTGTACAGGATTTAAATTGAATAAGCA 

>G264 Amino Acid Sequence (domain in AA coordinates: 24-114) 

MNGALGNSSASVSGGEGAGGPAPFLVKTYEMVDDSSTDQIVSWSANNNSFIVWNHAEFSR

LLLPTYFKHNNFSSFIRQLNTYGFRKIDPERWEFLNDDFIKDQKHLLKNIHRRKPIHSHS

HPPASSTDQERAVLQEQMDKLSREKAAIEAKLLKFKQQKVVAKHQFEEMTEHVDDMENRQ

KKLLNFLETAIRNPTFVKNFGKKVEQLDISAYNKKRRLPEVEQSKPPSEDSHLDNSSGSS

RRESGNIFHQNFSNKLRLELSPADSDMNMVSHSIQSSNEEGASPKGILSGGDPNTTLTKR

EGLPFAPEALELADTGTCPRRLLLNDNTRVETLQQRLTSSEETDGSFSCHLNLTLASAPL

PDKTASQIAKTTLKSQELNFNSIETSASEKNRGRQEIAVGGSQANAAPPARVNDVFWEQF

LTERPGSSDNEEASSTYRGNPYEEQEEKRNGSMMLRNTKNIEQLTL* 

>G32 (101..736)

AACACACATTCCCTCTCTTCCTTCAACTAGAAAiUVAGATAGATATATCGGAC^TTTATTG 
ATCTGTGTATGCATAAAGGTATAGTATCATTATTAGAAAGATGAACACAACATCATCAAA 
GAGCAAGAAGAAGCAAGACGATCAGGTTGGTACAAGGTTTCTTGGGGTGAGAAGAAGGCC 
TTGGGGAAGATACGCAGCTGAGATTAGAGACCCAACTACGAAGGAGCGTCACTGGCTTGG 
CACTTTCGATACGGCGGAAGAAGCTGCCTTGGCCTACGATAGAGCTGCTCGGTCCATGCG 
TGGCACACGTGCCAGAACCAACTTTGTTTACTCAGACATGCCTCCTTCCTCATCCGTCAC 
CTCCATTGTTTCTCCTGACGATCCTCCTCCTCCTCCACCTCCTCCTGCTCCTCCTAGCAA 
TGATCCTGTCGATTACATGATGATGTTTAACCAATACTCATCCACTGACTCGCCAATGCT 
TCAGCCTCATTGTGATCAAGTGGACAGTTACATGTTTGGTGGCTCTCAATCTTCGAATTC 
TTATTGCTATTCTAATGACAGTAGTAATGAGCTGCCTCCTCTCCCGAGCGACTTGTCGAA
TTCGTGTTATAGCCAACCACAGTGGACCTGGACCGGTGACGACTACTCGTCTGAGTACGT 
ACATAGTCCAATGTTCAGCAGAATGCCTCCGGTTTCTGACTCTTTCCCTCAAGGTTTCAA 
CTACTTTGGCTCCTAATTCTTTCTCATCGTCCATATTTAATACCTTCCTCATTTGTACCT 
TTTCCTTCTTCTTCTTTTTTGGGTTTATCTATGTTTCGCCGTCCTTGATCTCTGCCTATG 
TGATCAAAGTGACTGTTTGT(^TTAGTTTTTCAATAACAAGTTATCATTTGTATCTTGAA 
AAAAAAAAAAA 

>G32 Amino Acid Sequence (domain in aa coordinates: 17-84) 

MNTTSSKSKKKQDDQVGTRFLGVRRRPWGRYAAEIRDPTTKERHWLGTFDTAEEAALAYD

RAARSMRGTRARTNFVYSDMPPSSSVTSIVSPDDPPPPPPPPAPPSNDPVDYMMMFNQYS 

STDSPMLQPHCDQVDSYMFGGSQSSNSYCYSNDSSNELPPLPSDLSNSCYSQPQWTWTGD 

DYSSEYVHSPMFSRMPPVSDSFPQGFNYFGS* 

>G436 (1..2157) 

ATGGATTTTACTCGCGATGACAACTCAAGTGATGAACGGGAAAATGATGTAGACGCCAAC 
ACCAACAACCGTCACGAGAAGAAGGGTTACCATCGCCACACTAATGAACAAATTCATAGG 
CTTGAAACGTATTTCAAGGAATGTCCTCATCCAGACGAATTTCAGCGACGTCTGTTGGGT 
GAAGAACTGAATCTGAAACCAAAACAAATC^AATTTTGGTTTCAAAACAAAAGAACTCAA 
GCTAAGAGTCACAATGAAAAAGCAGACAATGCAGCGCTTAGGGCAGAAAATATTAAGATT 
AGACGTGAGAACGAATCAATGGAAGATGCACTGAATAATGTGGTTTGCCCTCCATGTGGT 
GGTCGTGGTCCTGGGAGAGAAGACCAACTTCGACATCTCCAAAAACTCCGTGCACAAAAC 
GCTTATCTCAAAGATGAGTATGAAAGAGTCTCAAACTACCTAAAACAGTACGGAGGTCAC 
TCAATGCATAACGTCGAGGCCACACCCTATCTCCATGGTCCATCAAACCATGCATCAACG 
TCCAAGAACCGTCCAGCATTGTACGGAACCTCTTCTAACCGTCTCCCCGAGCCTTCAAGC 
ATATTTAGAGGACCATACACTCGTGGAAACATGAACACCACCGCACCGCCTCAGCCGCGA 
AAGCCGCTGGAAATGCAGAATTTCCAACCACTATCTCAACTGGAGAAAATTGCAATGTTG 
GAAGCAGCGGAAAAAGCGGTGTCAGAGGTTTTGAGCCTCATTCAAATGGATGATACAATG 
TGGAAAAAGTCGTCTATTGATGATAGGCTCGTCATTGATCCAGGGCTCTATGAGAAATAT 
TTTACTAAGACTAACACAAATGGTCGTCCTGAGTCTTCTAAAGATGTCGTGGTGGTTCAA
ATGGATGCTGGAAACTTGATCGACATCTTCTTAACTGCGGAGAAATGGGCGAGGCTTTTT 
CCAACAATTGTGAACGAAGCTAAAACGATTCACGTCTTGGATTCCGTTGACCATCGAGGA 
AAAACTTTCTCAAGAGTGATTTATGAGCAACTGCACATACTGTCACCATTGGTGCCACCG 
AGGGAATTTATGATCCTAAGGACTTGCCAACAAATTGAAGACIAATGTCTGGATGATTGCT 
GATGTGTCGTGTCATCTCCCAAACATTGAGTTTGATCTTTCGTTTCCCATTTGCACCAAA 
CGTCCCTCAGGTGTGCTCATTCAAGCCTTGCCCCACGGCTTCTCTAAGGTGACGTGGATA 
GAGCATGTGGTAGTGAATGATAATAGAGTGCGGCCACATAAGCTTTACAGAGACCTCTTA 
TACGGCGGCTTTGGCTACGGAGCTCGACGTTGGACCGTTACTCTTGAGAGGACGTGTGAG 
AGGCTGATTTTCTCCACCTCCGTCCCTGCCTTGCCCAACAATGACAATCCCGGAGTTGTG 
CAAACAATACGAGGCAGAAATAGCGTAATGCATTTGGGAGAAAGAATGTTGAGGAACTTT 
GCATGGATGATGAAAATGGTTAACAAACTCGACTTCTCGCCACAGTCTGAAACTAACAAC 
AGCGGAATTAGGATTGGGGTGCGGATAAACAATGAGGCGGGTCAACCGCCCGGTCTCATT 
GTCTGTGCTGGTTCATCTTTATCCCTCCCTCTCCCTCCTGTCCAAGTGTACGATTTCCTT 
AAGAATCTGGAGGTTCGTCACCAGTGGGACGTTCTGTGCCATGGGAATCCAGCGACTGAG 
GCTGCTCGTTTCGTCACCGGATCAAACCCAAGGAACACTGTGTCTTTTCTCGAGCCTTCA 
ATTAGGGATATTAATACTAAGCTAATGATACTCCAAGATAGCTTCAAAGATGCATTGGGA 
GGAATGGTGGCCTACGCTCCAATGGATCTAAACACCGCCTGCGCTGCCATTTCAGGCGAT 
ATCGATCCTACCACCATTCCAATCCTCCCTTCCGGTTTTATGATCTCCCGTGACGGCCGT 
CCTTCCGAGGGCGAAGCCGAGGGTGGCAGCTATACACTCCTCACCGTGGCTTTCCAGATC 
CTTGTCTCCGGTCC8AGTTACTCTCCTGATACCAACCTGGAAGTTTCTGCCACCACAGTC 
AATACCTTGATTAGCTCCACCGTTCAAAGGATCAAAGCCATGCTCAAGTGCGAATGA 
>G436 Amino Acid Sequence (domain in AA coordinates: 22-85) 
MDFTRDDNSSDERENDVDANTNNRHEKKGYHRHTNEQIHRLETYFKECPHPDEFQRRLLG
EELNLKPKQIKFWFQNKRTQAKSHNEKADNAALRAENIKIRRENESMEDALNNVVCPPCG
GRGPGREDQLRHLQKLRAQNAYLKDEYERVSNYLKQYGGHSMHNVEATPYLHGPSNHAST
SKNRPALYGTSSNRLPEPSSIFRGPYTRGNMNTTAPPQPRKPLEMQNFQPLSQLEKIAML
EAAEKAVSEVLSLIQMDDTMWKKSSIDDRLVIDPGLYEKYFTKTNTNGRPESSKDVVVVQ
MDAGNLIDIFLTAEKWARLFPTIVNEAKTIHVLDSVDHRGKTFSRVIYEQLHILSPLVPP
REFMILRTCQQIEDNVWMIADVSCHLPNIEFDLSFPICTKRPSGVLIQALPHGFSKVTWI
EHVVVNDNRVRPHKLYRDLLYGGFGYGARRWTVTLERTCERLIFSTSVPALPNNDNPGVV
QTIRGRNSVMHLGERMLRNFAWMMKMVNKLDFSPQSETNNSGIRIGVRINNEAGQPPGLI

VCAGSSLSLPLPPVQVYDFLKNLEVRHQWDVLCHGNPATEAARFVTGSNPRNTVSFLEPS
IRDINTKLMILQDSFKDALGGMVAYAPMDLNTACAAISGDIDPTTIPILPSGFMISRDGR
PSEGEAEGGSYTLLTVAFQILVSGPSYSPDTNLEVSATTVNTLISSTVQRIKAMLKCE*
>G556 (50..1144)

CTTTTTTGAAGCCCTTTTGACACAAAAGACCAGAACAAGTTGAAGAAATATGAATACAAC 
CTCGACACATTTTGTTCCACCGAGAAGGTTTGAAGTTTACGAGCCTCTCAACCAAATCGG 
TATGTGGGAAG7U\AGTTTCAAGAACAATGGAGACATGTATACGCCTGGCTCTATCATAAT 
CCCGACTAACGAAAAACC^GACAGCTTGTCAGAGGATACTTCTCATGGGACAGAAGGAAC 
TCCTCACAAGTTTGACCAAGAGGCTTCCACATCTAGACATCCTGATAAGATACAGAGAAG 
GCTAGCACAGAATCGAGAGGCAGCTAGGAAAAGTCGTTTGCGCAAGAAAGCTTATGTTCA 
GCAGCTAGAGACTAGCCGGTTAAAGCTAATTCATTTAGAGCAAGAACTCGATCGTGCTAG 
ACAACAGGGTTTCTATGTGGGGAACGGAGTAGATACCAATGCTCTTAGTTTCTCAGATAA 
CATGAGCTCAGGGATTGTTGCATTTGAGATGGAATATGGACATTGGGTGGAAGAACAGAA 
CAGGCAAATATGTGAACTAAGAACGGTTTTACATGGACAAGTTAGTGATATAGAGCTTCG 
TTCTCTAGTCGAGAATGCCATGAAACATTACTTTCAACTCTTCCGAATGAAGTCAGCCGC 
TGCAAAAATCGATGTTTTCTATGTCATGTCCGGAATGTGGAAAACTTCAGCAGAGCGGTT 
TTTCTTGTGGATAGGCGGATTTAGACCCTCAGAGCTTCTCAAGGTTCTGTTACCGCATTT 
TGATCCTTTGACGGATCAACAACTTTTGGATGTATGTAATCTGAGGCAATCATGTCAACA 
ATCAGAAGATGCGTTATCCCAAGGTATGGAGAAACTGCAACATACATTAGCAGAGAGTGT 
AGCAGCCGGGAAACTTGGTGAAGGAAGTTATATTCCTCAAATGACTTGTGCTATGGAGAG 
ATTGGAGGCTTTGGTCAGCTTTGTAAATCAAGCTGATCATCTGAGACATGAGACATTGCA 
ACAGATGCATCGGATCTTAACCACGCGACAAGCGGCTAGAGGTTTGTTAGCATTAGGGGA 
GTATTTCCAAAGGCTTCGAGCTTTGAGl^CGAGTTGGGCGGCTAGGCAACGTGAACCyVAC 
GTAATTAAGGTGTTTAGATGTCAAGAAAGGTTTGAGACCTTAACAATCAAGAATGGAGTT 
TGCTGGTGAGTGGATTTTTGGGTCAAGAACAAGAGCAATAACACAAGCTGCTGTGTGATG 
ATGAATCTTGTCTTGCGGCTAAAGGAAATGTTTGAGGAAAGTTGTACATATGATCAGCAA 
CGTAAAGTTTATAGCTTTTTAGAAACCAACTTTTCGATGGTTGTTCTTTTTTTTTTGTAT 
GTAATATTATAGATAAGCTTGTGGTATATATGATTTTAATGTGACATTACGAACTTGATT 
TATAACCATGGTAAAAT 

>G556 Amino Acid Sequence (domain in AA coordinates: 83-143) 
MNTTSTHFVPPRRFEVYEPLNQIGMWEESFKNNGD^ 

TEGTPHKFDQEASTSRHPDKIQRRLAQNREAARKSRLRKKAYVQQLETSRLKLIHLEQEL
DRARQQGFYVGNGVDTNALSFSDNMSSGIVAFEMEYGHWVEEQNRQICELRTVLHGQVSD 
IELRSLVENAMKHYFQLFRMKSAAAKIDVFYVMSGMWKTSAERFFLWIGGFRPSELLKVL 
LPHFDPLTDQQLLDVCNLRQSCQQSEDALSQGMEKLQHTLAESVAAGKLGEGSYIPQMTC
AMERLEALVSFVNQADHLRHETLQQMHRILTTRQAARGLLALGEYFQRLRALSSSWAARQ
REPT*

>G1420 (39..1238)

AAAGTATCATCTCATAGATTCCATCTTTTCTCTATTACATGGAGAAGAAAAAAGAAGAGG 
ATCATCATCATCAACAACAACAACAACAACAAAAGGAGATCAAGAACACAGAGACAAAGA 
TCGAGCAAGAACAAGAACAAGAACAAAAACAAGAAATCTCTCAAGCATCATCATCATCAA 
ACATGGCGAATCTAGTTACGTCATCAGATCATCATCCGTTGGAGCTAGCTGGAAATCTCT 
CAAGCATCTTCGATACTTCATC1TTACCTTTTCCTTATTCTTATTTCGAAGATCACTCTT 
CTAATAATCCTAATTCTTTCCTAGACTTGCTCCGACAAGATCATCAGTTTGCTTCTTCCT 
CTAATTCCTCTTCTTTTTCATTCGATGCCTTTCCTCTCCCCAATAACAACAACAACACCT 
CTTTTTTTACGGATTTGCCCTTACCTCAAGCTGAGTCATCAGAAGTCGTGAAC^a^CAC 
CGACTTCTCCAAACTCAACCTCAGTCTCATCTTCCTCCAACGAAGCTGCAAATGATAACA 
ACAGTGGTAAAG AAGT TACTGTTAAAGATCAAG AAGAAGGAG AT CAACAACAAGAGCAAA 
AGGGTACTAAGCCACAGTTGAAGGCAAAGAAGAAGAATCAA^GAAAGCTAGAGAAGCTA 
GGTTTGCGTTTCTGACGAAGAGCGATATTGATAATCTTGACGACGGTTATAGGTGGAGAA 
AATACGGCCAAAAAGCTGTCAAAAACAGTCCTTATCCCAGAAGCTATTACCGTTGCACCA 
CAGTGGGTTGCGGAGTGAAGAAGAGAGTGGAGAGATCCTCCGATGATCCTTCGATCGTCA 
TGACAACCTACGAAGGTCAGCATACCCATCCTTTCCCCATGACGCCACGTGGACACATCG 
GAATGCTCACGTCACCAATCCTAGACCACGGTGCAACCACCGCGTCATCATCATCATTCT 
CCATCCCTCAGCCACGTTACTTGCTGACTCAACATCACCAGCCCTACAACATGTACAACA 
ACAACTCTCTAAGTATGATCAATAGAAGATCATCCGATGGCACTTTCGTAAATCCAGGTC

CATCATCATCATTCCCCGGC1TTGGTTATGATATGTCTCAAGCTTCTACTTCAACTTCTT 

CTTCCATTAGAGATCATGGATTGCTTCAAGATATTCTTCCTTCGCAGATCAGATCCGATA 

CTATTAACACTCAAACCAATGAAGAGAATAAGAAATGAAGAAGTTTTTTTTCCCGGGGCA 

ATTGTTTTTTTCTTTAGGCCGGATCCGGTAGGTAGGTTTCATGAGC 

>G1420 Amino Acid Sequence (domain in AA coordinates: 221-280)

MEKKKEEDHHHQQQQQQQKEIKNTETKIEQEQEQEQKQEISQASSSSNMANLVTSSDHHP 

LELAGNLSSIFDTSSLPFPYSYFEDHSSNNPNSFLDLLRQDHQFASSSNSSSFSFDAFPL

PNNNNNTSFFTDLPLPQM^ 

GDQQQEQKGTKPQLKAKKKNQKKAREARFAFLTKSDIDNLDDGYRWRKYGQKAVKNSPYP 
RSYYRCTTVGCGVKKRVERSSDDPSIVMTTYEGQHTHPFPMTPRGHIGMLTSPILDHGAT 
TASSSSFSIPQPRYLLTQHHQPYNMYNNNSLSMINRRSSDGTFVNPGPSSSFPGFGYDMS
QASTSTSSSIRDHGLLQDILPSQIRSDTINTQTNEENKK*
>G1412 (115..1008)

CCCACGCGTCCGCCC^CGCGTCCGAAACAAAAACAT 

AACTTGAAATCTTTTTTTTTTTGGTTGCTGAGGAATCGAAGTAGAAGAGTATAAATGGGT 

GTTAGAGAGAAAGATCCGTTAGCCCAGTTGAGTTTGCCACCAGGTTTTAGATTTTATC 

ACAGATGAAGAGCTTCTTGTTCAGTATCTATGTCGGAAAGTTGCAGGCTATCATTTCTCT 

CTCCAGGTCATCGGAGACATCGATCTCTACAAGTTCGATCCTTGGGATTTGCCAAGTAAG 

GCTTTGTTTGGAGAGAAGGAATGGTATTTCTTTAGCCCAAGAGATCGGAAATATCCGAAC 

GGGTCAAGACCCAATAGAGTAGCCGGGTCGGGTTATTGGAAAGCAACGGGTACTGACAAA 

ATTATCACGGCGGATGGTCGTCGTGTCGGGATTAAAAAAGCTCTGGTCTTTTACGCCGGA 

AAAGCTCCCAAAGGCACTAAAACCAACTGGATTATGCACGAGTATCGCTTAATAGAACAT 

TCTCGTAGCCATGGAAGCTCCAAGTTGGATGATTGGGTGTTGTGTCGAATTTACAAGAAA 

ACATCTGGATCTCAGAGACAAGCTGTTACTCCTGTTCAAGCTTGTCGTGAAGAGCATAGC 

ACGAATGGGTCGTCATCGTCTTCTTCATCACAGCTTGACGACGTTCTTGATTCGTTCCCG 

GAGATAAAAGACCAGTCTTTTAATCTTCCTCGGATGAATTCGCTCAGGACGATTCTTAAC 

GGGAACTTTGATTGGGCTAGCTTGGCAGGTCTTAATCCAATTCCAGAGCTAGCTCCGACC 

AATGGATTACCGAGTTACGGTGGTTACGATGCGTTTCGAGCGGCGGAAGGTGAGGCGGAG 

AGTGGGCATGTGAATCGGCAGC^GAACTCGAGCGGGTTGACTCAGAGTTTCGGGTACAGC 

TCGAGTGGGTTTGGTGTTTCGGGTCAAACATTCGAGTTTAGGCAATGAGAGAGATGTGAA 

GTTACTGATGGGTGAAAAAAGTAAAAAAAAAACTTGGAGATAGTAGAGTGGCAATTGATG 

TAAATAATAGGGATTTATATGGGGCTTTTACCGATTCGGTGAGGCTTAGGATTCCCCAAA 

GGAAAAAGGCTCGACTGGGGACTAGTTTGATCCAACTTGACGGCCCCCAAATGTGTAATG 

TTTCTCAACGGAGAGAAAAATAAATGGTTACCAATATTTTTCCAAAAAAAT^AAAAAAT^ 

>G1412 Amino Acid Sequence (domain in AA coordinates: 17-159) 

MGVREKDPLAQLSLPPGFRFYPTDEELLVQYLCRKVAGYHFSLQVIGDIDLYKFDPWDLP 

SKALFGEKEWYFFSPRDRKYPNGSRPNRVAGSGYWKATGTDKIITADGRRVGIKKALVFY

AGKAPKGTKTNWIMHEYRLIEHSRSHGSSKLDDWVLCRIYKKTSGSQRQAVTPVQACREE

HSTNGSSSSSSSQLDDVLDSFPEIKDQSFNLPRMNSLRTILNGNFDWASLAGLNPIPELA

PTNGLPSYGGYDAFRAAEGEAESGHVNRQQNSSGLTQSFGYSSSGFGVSGQTFEFRQ*

>G738 (1..885) 

ATGGACCATCATCAGTATCATCATCATGATCAATACCAACATCAGATGATGACTAGTACT 
AACAATAATTCCTATAACACCATCGTCACAACACAACCACCACCAACAACAACAACAATG 
GATTCAACAACAGCAACAACTATGATAATGGATGACGAGAAGAAGTTGATGACGACAATG 
AGCACTAGGCCGGAAGAACCAAGAAACTGTCCAAGATGCAACTCAAGCAACACCAAGTTT 
TGTTATTACAACAACTACAGCTTAGCAC^GCCTAGGTACTTGTGTAAGTCTTGTCGGAGA 
TATTGGACTGAAGGTCGCTCTCTCCGTAACGTCCCCGTAGGCGGAGGTTCTAGAAAGAAC 
AAGAAGCTTCCATTTCCTAATTCCTCTACTTCTTCTTCCACCT^AGAACCTCCCGGATCTC 
AACCCTCCTTTCGTCTTCACATCATCAGCTTCATCATCAAACCCTAGCAAGACGCATCAA 
AACAATAATGACCTCAGCCTATCCTTCTCCTCCCCTATGCAAGACAAGCGAGCTCAAGGG 
CATTACGGTCATTTCAGTGAGCAAGTTGTGACAGGAGGGCAGAACTGTCTTTTCCAAGCT 
CCTATGGGAATGATTCAGTTTCGTCAAGAGTATGATCATGAGCACCCCAAAAAGAATCTT 
GGGTTTTCATTAGAC^GGAACGAGGAAGAGATTGGTAATCATGATAACTTCGTTGTTAAT 
GAGGAAGGAAGTAAGATGATGTATCCTTATGGAGATCATGAAGACCGTCAACAACATCAC 
CATGTGAGACACGATGATGGTAATAAGAAGAGAGAAGGTGGTTCAAGCAATGAGCTATGG 
AGCGGAATCATCCTAGGTGGTGATAGTGGTGGACCAACATGGTGA 
>G738 Amino Acid Sequence (domain in aa coordinates: 351-393)
MDHHQYHHHDQYQHQMMTSTNNNSYNTIVTTQPPPTTTTMDSTTATTMIMDDEKKLMTTM
STRPQEPRNCPRCNSSNTKFCYYNNYSLAQPRYLCKSCRRYWTEGGSLRNVPVGGGSRKN 
KKLPFPNSSTSSSTKNLPDLNPPFVFTSSASSSNPSKTHQNNNDLSLSFSSPMQDKRAQG
HYGHFSEQVVTGGQNCLFQAPMGMIQFRQEYDHEHPKKNLGFSLDRNEEEIGNHDNFVVN
EEGSKMMYPYGDHEDRQQHHHVRHDDGNKKREGGSSNELWSGIILGGDSGGPTW* 
>G2426 (1..1038) 

ATGGGCAGATCGCCATGTTGTGATAAGGCCGGGTTGAAGAAAGGGCCTTGGACTCCAGAA 
GAGGATCAGAAACTTTTGGCTTATATTGAAGAACATGGCCATGGAAGCTGGCGTTCTTTG 
CCTGAGAAAGCCGGTCTCCAAAGGTGTGGAAAGAGTTGCAGACTCAGATGGACTAACTAC 
CTAAGACCTGACATCAAGAGAGGCAAATTCACTGTACAAGAAGAACAAACCATCATTCAA 
CTCCACGCTCTCCTCGGAAACAGGTGGTCAGCGATTGCAACTCATTTACCAAAGAGGACA 
GACAACGAGATC^AGAACTACTGGAACACACACTTGAAGAAACGTCTGATCAAAATGGGG 
ATAGATCCAGTGACTCACAAGCACAAAAACGAGACTCTTTCGTCTTCCACAGGACAATCA 
AAGAACGCAGCCACGCTTAGTCATATGGCTCAATGGGAGAGTGCAAGACTCGACGCTGAA 
GCAAGGCTAGCTAGAGAATCAAAGCTTCTCCATTTACAGCATTACCAAAACAATAACAAC 
CTTAACAAATCAGCAGCTCCTCAACAA(^TTGCTTCACTCAAAAAACATC^ACAAACTGG 
ACTAAACCAAACCAAGGAAACGGAGACCAACAGCTTGAATCTCCGACATCGACGGTGACA 
TTCTCTGAGAATCTTCTGATGCCTTTAGGAATCCCTACGGATAGCAGCAGAAATAGAAAC 
AATAACAACAATGAGTCCTCGGCGATGATTGAATTGGCCGTATCTTCGTCAACCTCCTCC 
GATGTGAGTCTGGTCAAAGAACATGAACACGACTGGATTAGGCAGATCAACTGTGGTAGT 
GGAGGAATAGGAGAAGGATTCACGAGTCTATTGATCGGTGATTCGGTCGGCCGGGGTTTA 
CCCACCGGGAAAAACGAAGCGACGGCGGGCGTGGGGAATGAGAGTGAGTATAACTACTAT 
GAGGATAACAAGAATTACTGGAATAGCATTCTCAACTTGGTTGATTCTTCACCGTCCGAT 
TCCGCGACGATGTTCTGA 

>G2426 Amino Acid Sequence (conserved domain in AA coordinates: 14-114)

mgrspccdkaglkkgpwtpeedqkllayieehghgswrslpekaglqrcgkscrlrwtny
lrpdikrgkftvqeeqtiiqlhallgnrwsaiathlpkrtdneiknywnthlkkrlikmg
idpvthkhknetlssstgqsknaatlshmaqwesarldaearlareskllhlqhyqnnnn
lnksaapqqhcftqktstnwtkpnqgngdqqlesptstvtfsenllmplgiptdssrnrn
nnnnessamielavssstssdvslvkehehdwirqincgsggigegftslligdsvgrgl

ptgkneatagvgneseynyyednknywnsilnlvdsspsdsatmf*

>G1524 (1..825) 

atggggagaactaaggagcaggcaacattaactcggtatccaccctgtcctaggaatcct 
gctaaattcaatgatataaacaaagcactccaggaaaaaggatatggtaaggctctgaaa 
Xgaaaaccttggacgggtgtgacatgccctgtctgtcttgaggttcctcacaactcggtc 
gtcctcctttgttcatgttaccacaaaggatgccgtccgtacatgtgtgccacgggaaac 
cgtttctcaaattgtctagagcagtacaaaaaggcatatgccaaggatgagaaaagtgac 
aaaccgccagagctattgtgcccgctttgtaggggtcaggtgaaaggctggaccgttgtg 
gaaaaggaacgtaagtatctgaattctaagaaaaggtcatgcatgaacgacgagtgtttg 
ttttatggaagctatagacagctcaagaagcatgttaaggagaaccatccgagagccaag 
ccaagagccatagaccctgtgctggaggcgaaatggaagaagcttgaggttgagagggag 
aggagtgatgtaatcagcacagtcatgtcgtcaacacctggggctatggtatttggagac 
tatgtgattgagccatacaatggttatgatcatcaagatgacagtgacgattacagtgat 
tcgtcggatgacgaaatggaaggtggggtattcgagcttggagcattcgacctgggccgt 
cttcaaccgcgttcggctgccatctcaagccggggaattcgcggtatgatcataaggaac 
cggtgggctcgaagcagaggtgcgagcagaaggcgacaaacataa 

>G1524 Amino Acid Sequence (conserved domain in AA coordinates: 49-110)

MGRTKEQATLTRYPPCPRNPAKFNDINKALQEKGYGKALKRKPWTGVTCPVCLEVPHNSV

VLLCSCYHKGCRPYMCATGNRFSNCLEQYKKAYAKDEKSDKPPELLCPLCRGQVKGWTVV

EKERKYLNSKKRSCMNDECLFYGSYRQLKKHVKENHPRAKPRAIDPVLEAKWKKLEVERE

RSDVISTVMSSTPGAMVFGDYVIEPYNGYDHQDDSDDYSDSSDDEMEGGVFELGAFDLGR

LQPRSAAISSRGIRGMIIRNRWARSRGASRRRQT*

>G1243 (1..3174) 

ATGGCGAGAAATTCGAATTCCGATGAGGCTTTCTCGTCAGAGGAGGAAGAAGAGCGGGTT 
AAGGATAATGAAGAAGAAGATGAGGAGGAGCTCGAGGCTGTTGCTCGTTCTTCTGGCTCC 
GACGATGACGAAGTAGCCGCCGCCGACGAATCACCAGTCTCCGACGGAGAGGCTGCTCCC 
GTAGAAGATGATTACGAGGACGAAGAAGATGAGGAAAAAGCTGAAATCAGCAAACGTGAG

AAAGCCAGACTTAAAGAGATGCAGAAGTTGAAGAAGCAGAAGATTCAAGAGATGCTGGAG 

TCGCAGAATGCTTCCATTGACGCGGATATGAACAATAAGGGAAAAGGGAGACTGAAGTAT 

CTTCTGCAGCAAACTGAGTTATTTGCCCACTTTGCTAAAAGTGATGGATCTTCTTCTCAG 

AAGAAGGCAAAAGGAAGGGGACGTCATGCTTCCAAAATAACTGAAGAGGAGGAAGACGAA 

GAGTATCTAAAGGAAGAAGAGGATGGCTTAACTGGATCTGGAAACACACGGTTACTCACA 

CAGCCCTCTTGTATTCAAGGGAAGATGAGAGATTACCAATTAGCTGGTTTGAACTGGCTC 

ATTCGTCTTTATGAGAATGGCATAAATGGAATTCTTGCTGATGAAATGGGTCTGGGGAAG 

ACGCTTCAAACGATTTCTTTGTTGGCATATCTTCATGAATACAGGGGAATCAATGGTCCC 

CATATGGTGGTTGCTCCAAAATCAACACTTGGTAATTGGATGAACGAAATTCGCCGGTTT 

TGTCCTGTCCTACGTGCTGTGAAGTTCCTTGGTAATCCTGAGGAGAGGAGACATATTCGA 

GAAGACCTGCTAGTTGCTGGGAAATTTGATATTTGTGTCACAAGCTTTGAGATGGCCATC 

AAAGAGAAGACAGCACTTCGTCGGTTTAGCTGGCGTTATATTATCATTGATGAAGCGCAT 

CGAATCAAGAACGAGAATTCACTCCTTTCTAAAACCATGAGACTTT^ 

CGGCTTOTTATCACGGGGACCCCCCTTCAGAATAATCTCCATGAACTGTGGGCTCTTCTA 

AATTTTOTTCTGCCTGAGATTTTTAGTTCAGCAGAGACTTTTGATGAATGGTTTCAAATT 

TCTGGTGAGAATGACCAGCAAGAAGTTGTGCAACAACTGC^CAAGGTTCTTCGACCATTT 

CTTCTTCGAAGACTAAAGTCAGATGTTGAGAAAGGTTTGCCACCGAAGAAGGAGACCATA 

CTTAA^GTTGGTATGTCTCAGATGCAAAAGCAATACTACAAGGCTTTACTGCAGAAGGAT 

CTTGAAGCGGTTAATGCTGGTGGAGAACGCAAACGTCTGCTAAACATTGCAATGCAACTG 

CGTAAATGCTGCAATCACCCCTATCTCTTCCAGGGTGCAGAACCTGGTCCCCCATATACC 

ACAGGAGATCACCTTATAACAAATGCTGGTAAGATGGTTCTCTTGGATAAATTGCTTCCT 

AAGTTGAAAGAACGTGATTCAAGGGTGCTGATATTTTCTCA^ 

ATTCTTGAGGACTATTTAATGTATCGTGGTTACTTGTATTGCCGTATTGATGGAAACACT 
GGTGGTGACGAACGAGATGCCTCCATAGAAGCCTACAACAAGCCAGGAAGTGAGAAATTT 
GTTTTCTTGTTATCTACTAGAGCTGGAGGGCTTGGTATCAATCTTGCTACTGCAGATGTT 
GTGATCCTTTACGATAGTGATTGGAACCCACAAGTCGACTTGCAAGCTCAGGATCGTGCC 
CATAGGATTGGTCAAAAAAAAGAAGTTCAAGTGTTTCGATTCTGCACTGAGTCTGCTATT 
GAGGAGAAAGTGATTGAAAGAGCTTACAAGAAGTTAG^^ 

CAAGGGAGATTGGCAGAAC^GAAAAGTAAGTCTGTCAATAAGGATGAGTTGCTTCAAATG 

GTAAGATATGGTGCTGAGATGGTGTTCAGTTCTAAAGATAGCACAATCACAGACGAGGAT 

ATTGATAGAATCATTGCCAAAGGAGAAGAGGCAACAGCTGAACTTGATGCTAAGATGAAG 

AAATTCACAGAAGATGCTATACAGTTTAAAATGGATGACAGTGCTGACTTCTATGATT^ 

GATGATGACAATAAGGATGAAAACAAGCTCGATTTTAAAAAGATTGTAAGCGACAATTGG 

AATGATCCCCCCAAGCGGGAGAGAAAGCGCAACTACTCTGAATCTGAGTACTTTAAGCAA 

ACATTGCGGCAAGGTGCTCCAGCTAAACCTAAAGAGCCTAGAATTCCGCGCATGCCCCAG 

TTGCACGATTTCCAGTTCTTTAACATTCAGAGATTGACCGAGTTGTATGAAAAGGAAGTA 

CGTTATCTCATGCAAACACATCAGAAAAATCAGTTGAAAGACACAATTGATGTTGAAGA^ 

CCAGAAGGTGGGGATCCCTTAACTACTGAAGAAGTAGAAGAAAAGGAGGGATTATTGGAG 

GAGGGTTTCTCAACATGGAG CAGAAGAGATTTTAATACTTTC CT CAGGGCTTGTGAGAAG 

TATGGCCGCAACGACATAAAAAGCATTGCCTCTGAGATGGAAGGGAAAACAGAGGAAGAA 

GTTGAAAGATATGCCAAAGTATTTAAAGAGCGGTACAAGGAGCTGAACGACTATGATAGA 

ATCATTAAGAACATTGAGAGGGGAGAGGCAAGGATCTCTAGGAAAGACGAAATCATGAAG 

GCCATAGGGAAGAAACTGGATCGCTACAGAAACCCTTGGCTGGAACTGAAGATTCAATAT 

GGTCAGAACAAAGGCAAGCTGTACAATGAAGAGTGTGACCGTTTCATGATCTGCATGATT 

CACAAACTTGGTTATGGGAATTGGGATGAGCTAAAGGCAGCATTTAGGACATCGTCTGTG 

TTCAGGTTTGACTGGTTTGTGAAATCCCGCACGAGTCAGGAACTTGCAAGAAGATGCGAC 

ACTCTGATTCGACTGATCGAGAAAGAGAACCAGGAGTTTGATGAAAGAGAGAGGCAAGCC 

CGCAAAGAGAAGAAGCTCGCGAAGAGTGCAACACCATCAAAGCGACCTTTAGGAAGACAA 

GCAAGTGAGAGTCCTTCATCGACGAAGAAGCGGAAGCACCTGTCGATGAGATGA 

>G1243 Amino Acid Sequence (domain in AA coordinates: 216-609) 

MARNSNSDEAFSSEEEEERVKDNEEEDEEELEAVARSSGSDDDEVAAADESPVSDGEAAP 

VEDDYEDEEDEEKAEISKREKARLKEMQKLKKQKIQEMLESQNASIDADMNNKGKGRLKY

LLQQTELFAHFAKSDGSSSQKKAKGRGRHASKITEEEEDEEYLKEEEDGLTGSGNTRLLT 

QPSCIQGKMRDYQLAGLNWLIRLYENGINGILADEMGLGKTLQTISLLAYLHEYRGINGP 

HMVVAPKSTLGNWMNEIRRFCPVLRAVKFLGNPEERRHIREDLLVAGKFDICVTSFEMAI

KEKTALRRFSWRYIIIDEAHRIKNENSLLSKTMRLFSTNYRLLITGTPLQNNLHELWALL
NFLLPEIFSSAETFDEWFQISGENDQQEVVQQLHKVLRPFLLRRLKSDVEKGLPPKKETI

LKVGMSQMQKQYYKALLQKDLEAVNAGGERKRLLNIAMQLRKCCNHPYLFQGAEPGPPYT

TGDHLITNAGKMVLLDKLLPKLKERDSRVLIFSQMTRLLDILEDYLMYRGYLYCRIDGNT 

GGDERDASIEAYNKPGSEKFVFLLSTRAGGLGINLATADVVILYDSDWNPQVDLQAQDRA

HRIGQKKEVQVFRFCTESAIEEKVIERAYKKLALDALVIQQGRLAEQKSKSVNKDELLQM 

VRYGAEMVFSSKDSTITDEDIDRIIAKGEEATAELDAKMKKFTEDAIQFKMDDSADFYDF 

DDDNKDENKLDFKKIVSDNWNDPPKRERKRNYSESEYFKQTLRQGAPAKPKEPRIPRMPQ

LHDFQFFNIQRLTELYEKEVRYLMQTHQKNQLKDTIDVEEPEGGDPLTTEEVEEKEGLLE 

EGFSTWSRRDFNTFLRACEKYGRNDIKSIASEMEGKTEEEVERYAKVFKERYKELNDYDR 

IIKNIERGEARISRKDEIMKAIGKKLDRYRNPWLELKIQYGQNKGKLYNEECDRFMICMI 

HKLGYGNWDELKAAFRTSSVFRFDWFVKSRTSQELARRCDTLIRLIEKENQEFDERERQA 

RKEKKLAKSATPSKRPLGRQASESPSSTKKRKHLSMR*

>G631 (190..1461)

CTTCTTCTTCTTCTTCTTCTTCTTCTTCCTCCTCTCTCGTCGGATCTCTCTGATTTAGTG 

ATTTTTCAT^TTTCAAGTTTTCTTCACCTTTAATTTTGTGTCTCGTTGATCTCTCTTTGG 

ACATTCTGCTTTGGATTCTGGAGGCTTCTCATTAGATCTCTATTAGTGGGTTTAGGTCAA 

GTTCTTGAAATGGATAAGGAGAAATCTCCTGCACCACCACCTAGTGGAGGTCTTCCTCCA 

CCATCGGGTCGTTACTCTGCGTTTTCACCTAATGGAAGTAGCTTTGCAATGAAAGCTGAA 

TCATCTTTTCCTCCTTTGACTCCAAGTGGAAGCAATAGCTCAGATGCTAACCGATTCAGC 

CATGATATTAGCCGAATGCCGGATAATCCACCTAAGAACCTAGGCCATCGCCGAGCTCAT 

TCAGAGATTCTTACTCTTCCTGATGACTTAAGCTTTGATAGTGATCTTGGTGTGGTTGGT 

GCTGCTGATGGACCTTCTTTCTCTGATGATACTGACGAGGACTTACTCTATATGTATCTT 

GATATGGAAAAATTCAATTCTTCTGCTACATCGACTTCTCAAATGGGTGAGCCATCAGAA 

CCGACTTGGAGGAATGAATTAGCCTCGACTTCTAACCTTCAGAGTACACCCGGTAGCTCT 

AGTGAAAGACCGAGAATTAGACACCAACACAGCCAATCGATGGATGGTTCAACAACTATC 

AAGCCTGAGATGCTTATGTCAGGGAATGAAGATGTGTCTGGAGTTGACTCTAAGAAAGCC 

ATCTCTGCTGCTAAACTTTCTGAGCTTGCTCTCATTGATCCAAAACGCGCCAAGAGGATA 

TGGGCAAACAGGC^GTCTGCTGCGAGGTCAAAAGAAAGGAAGATGAGATACATTGCAGAG 

CTCGAGAGAAAAGTACAGACTTTACAAACAGAGGCCACATCTCTCTCAGCCCAGTTGACT 

CTCTTACAGAGAGATACAAATGGCCTGGGTGTTGAAAACAATGAGCTTAAACTGCGAGTA 

CAAACTATGGAGCAACAGGTCCACCTACAGGATGCTTTAAATGATGCACTAAAGGAGGAA 

GTCCAGCATCTTAAGGTATTGACGGGGCAAGGTCCATCT^AATGGTACATCAATGAACTAC 

GGTTCTTTTGGATCAAACCAGCAATTCTATCCCAATAATCAGTCGATGCACACTATCTTA 

GCCGCACAACAGTTACAGCAGCTCCAGATCCAGTCACAGAAACAGCAACAACAAG^CAG 

CAACACCAGCAACAACAACAGCAGCAGCAGCAGCAATTTCACTTTCAACAGCAGCAACTG 

TACCAGCTTCAGCAGCAGCAACGGCTTCAACAACAGGAACAACAAAGCGGGGCTTCAGAG 

CTAAGAAGACCCATGCCTTCTCCTGGTCAGAAAGAGAGTGTGACATCGCCTGATCGTGAA 

ACTCCCTTGACAAAAGACTGAGTCTAGACTGTGCTAATGTCCAATTTAGTAAGTTACTCT 

TGGAAJ\ATCTTCTTTTTCATCGCAGGCTCATGGATTTGGGATTTACTGCATTATAGAGTT 

AAAAACAAGACAGCTTAGAAGTTGCGGATTTAGAAGTTGTTAGTGAAGCTTTTGTTCTCG 

TCTGTTGGTAGTTTACAATCTTCTCTTTGTATGATCCTAAG 

>G631 Amino Acid Sequence (domain in AA coordinates: TBD) 

MDKEKSPAPPPSGGLPPPSGRYSAFSPNGSSFAMKAESSFPPLTPSGSNSSDANRFSHDI 

SRMPDNPPKNLGHRRAHSEILTLPDDLSFDSDLGVVGAADGPSFSDDTDEDLLYMYLDME

KFNSSATSTSQMGEPSEPTWRNELASTSNLQSTPGSSSERPRIRHQHSQSMDGSTTIKPE 

MLMSGNEDVSGVDSKKAISAAKLSELALIDPKRAKRIWAN^ 

KVQTLQTEATSLSAQLTLLQRDTNGLGVENNELKLRVQTMEQQVHLQDALNDALKEEVQH

LKVLTGQGPSNGTSMNYGSFGSNQQFYPNNQSMHTILAAQQLQQLQIQSQKQQQQQQQHQ 

QQQQQQQQQFHFQQQQLYQLQQQQRLQQQEQQSGASELRRPMPSPGQKESVTSPDRETPL

TKD* 

>G1909 (1..828) 

ATGGGTGGATCGATGGCGGAGAGAGCAAGGCAGGCCAACATTCCTCCACTAGCGGGACCC 
CTAAAGTGTCCTCGATGCGACTCCAGCAACACTAAGTTCTGTTACTACAACAACTATAAC 
CTCACTCAGCCTCGTCACTTCTGCAAAGGTTGCCGTCGCTACTGGACACAAGGGGGCGCC 
CTGAGAAACGTCCCTGTAGGTGGAGGCTGCCGGAGGAATAACAAGAAGGGCAAAAATGGA 
AATTTATUU^TCTTCTTCTTCTTCGTCCAAACAGTCTTCCTCGGTCAACGCTCAAAGTCCT 
AGCTCAGGACAGCTAAGGACAAATCATCAGTTCCCTTTTTCACCAACTCTTTACAATCTC 
ACTCAACTCGGAGGTATTGGTTTGAACTTAGCCGCTACTAATGGCAACAACCAAGCTCAC
CAGATCGGTTCCAGTTTGATGATGAGCGATCTAGGGTTTCTCCATGGACGAAATACTTCA 
ACTCCGATGACGGGAAACATTCATGAAAACAACAACAATAATAACAATGAAAACAACCTA 
ATGGCATCCGTTGGATCTTTGAGCCCCTTTGCTCTCTTCGATCCAACGACGGGGCTATAC 
GCTTTCCAGAACGACGGTAATATCGGGAACAACGTTGGGATATCTGGTTCTTCTACTTCC 
ATGGTTGATTCTAGGGTTTATCAGACGCCTCCGGTGAAGATGGAAGAACAACCTAATTTG 
GCTAACTTGTCTAGACCGGTCTCCGGTTTGACGTCTCCTGGGAATCAAACAAATCAGTAC 
TTTTGGCCTGGTTCGGATTTCTCGGGTCCTTCTAATGATCTCTTGTGA 

>G1909 Amino Acid Sequence (conserved domain in AA coordinates: 23-51)
MGGSMAERARQANIPPLAGPLKCPRCDSSNTKFCYYNNYNLTQPRHFCKGCRRYWTQGGA
LRNVPVGGGCRRNNKKGKNGNLKSSSSSSKQSSSVNAQSPSSGQLRTNHQFPFSPTLYNL
TQLGGIGLNLAATNGNNQAHQIGSSLMMSDLGFLHGRNTSTPMTGNIHENNNNNNNENNL
MASVGSLSPFALFDPTTGLYAFQNDGNIGNNVGISGSSTSMVDSRVYQTPPVKMEEQPNL
ANLSRPVSGLTSPGNQTNQYFWPGSDFSGPSNDLL*

>G1663 (64..630)

TTCTCTCTGTGAATCCTTGTTCATCGTC^CTGAAATTAGTTTACAAAATCGACGAATTCG 

GAGATGATTTTTCAGAATGTGTGCAGAAATGAGTCCAACTTCAACGCTATAGCTTCCGAA 

TCGCGTTCCCAAACGCAGTTCGGTGTTTCGAAATCCTCCTCGAGCGGCGGCGGATGTATC 

TCCGCCAGGACTAAAGACCGTCACACGAAGGTTAACGGACGAAGCCGTCGAGTTACGATG 

CCGGCTCTCGCCGCCGCTAGGATTTTCCAGTTAACGCGTGAGCTCGGTCACAAAACTGAA 

GGAGAAACCATCGAATGGCTTCTTAGTCAAGCTGAACCGTCGATTATTGCCGCCACTGGC 

TACGGGACTAAGCTCATTTCGAATTGGGTTGATGTTGCGGCGGACGATTCCTCGTCGTCG 

TCGTCGATGACGTCGCCGCAAACGCAAACGCAAACGCCACAATCGCCGAGTTGTAGGTTG 

GATCTTTGTCAGCCAATCGGAATTCAGTATCCGGTGAATGGTTACAGTCATATGCCGTTC 

ACAGCGATGCTTTTAGAGCCGATGACCACGACGGCGGAATCTGAGGTTGAGATCGCGGAG 

GAGGAGGAACGTAGACGCCGTCACCATTAGTAAAATTAGGCTTTTGATTTAGAGTGTTAA 

AATTAGGATTTTAAAAGTTTAGGAGGTAACAGATAAGGATAATT 

>G1663 Amino Acid Sequence (domain in AA coordinates: TBD) 

MIFQNVCRHESNFNAIASESRSQTQFGVSKSSSSGGGCISARTKDRHTKVNGRSRRVTMP

ALAAARIFQLTRELGHKTEGETIEWLLSQAEPSIIAATGYGTKLISNWVDVAADDSSSSS 

SMTSPQTQTQTPQSPSCRLDLCQPIGIQYPVNGYSHMPFTAMLLEPMTTTAESEVEIAEE 

EERRRRHH* 

>G1231 (103..870)

CAAACCGAAATTCTCTCAGCGCCGGTCAAATACTTGTCTCTCTCTCTCTCTCTCTTTCAC 
TCTTGTCTTGTCTCCTTCGAAGCTGTTTGTTCTGTAAGAAAGATGGAAGCAGGTGGCGCG 
TACAATCCACGCACTGTTGAAGAGGTGTTTAGGGATTTTAAGGGTCGTAGAGCTGGCATG
ATTAAGGCTTTAACCACTGATGTTCAGGAGTTTTTCCGACTTTGTGATCCCGAAAAGGAG 
AACCTTTGCCTTTACGGACATCCAAATGAGCACTGGGAAGTGAATTTGCCAGCTGAAGAG 
GTTCCTCCTGAGCTCCCAGAGCCTGTCTTGGGTATCAATTTTGCCAGAGACGGGATGGCG 
GAAAAGGATTGGTTGTCCCTTGTTGCTGTCCACAGTGATGCTTGGCTTCTTGCTGTTGCT 
TTCTTTTTTGGAGCCAGGTTTGGATTTGACAAAGCTGATAGGAAGAGGCTTTTCAATATG 
GTGAATGACCTCCCAACAATCTTTGAGGTTGTAGCTGGCACTGCTAAGAAACAAGGAAAA 
GATAAGTCCTCTGTTTCCAACAACAGCAGCAACAGATCCAAATCAAGCTCCAAGCGAGGA 
TCTGAATCCCGTGCCAAGTTCTCAAAGCCGGAGCCCAAAGATGATGAGGAGGAGGAAGAG 
GAAGGTGTGGAAGAGGAGGATGAGGATGAGCAAGGTGAAACACAGTGTGGAGCATGTGGT 
GAGAGCTATGCAGCTGATGAGTTCTGGATTTGCTGTGACCTCTGTGAGATGTGGTTTCAT 
GGAAAGTGTGTTAAGATAACACCAGCAAGAGCTGAGCACATCAAGCAATACAAGTGCCCT 
TCTTGCAGCAACAAAAGGGCTCGTTCCTAAATTTGTTGACCGCTCGCTTCTGTGTATCTA 
CCTTTGCATATGATGATGAACAGCTTAACTGTTTGGTTTAGATCAGATTTGTCATATGGA 
TTTGGTAATTTAGGAAGACATTTTAGTTTTTTCATTGTTACATTTTGGCGATTGAAGGGA 
TAACTCTTTGTTTAGGGGTAATGATCTTTTGCTCTGTTTTATGTTTGTTTATTAACATTC 
TTCAAACTCAATCAAAAGTATTTTGGTTAGTCTTAAAA 

>G1231 Amino Acid Sequence (domain in AA coordinates: TBD) 

MEAGGAYNPRTVEEVFRDFKGRRAGMIKALTTDVQEFFRLCDPEKENLCLYGHPNEHWEV 

NLPAEEVPPELPEPVLGINFARDGMAEKDWLSLVAVHSDAWLLAVAFFFGARFGFDKADR

KKLFNMVNDLPTIFEVVAGTAKKQGKDKSSVSNNSSNRSKSSSKRGSESRAKFSKPEPKD
DEEEEEEGVEEEDEDEQGETQCGACGESYAADEFWICCDLCEMWFHGKCVKITPARAEHI 
KQYKCPSCSNKRARS*
>G227 (21..983)

GTACCGTCGACGATCCGGCGATGTCAAACCCGACCCGTAAGAATATGGAGAGGATTAAAG 
GTCCATGGAGTCCAGAAGAAGATGATCTGTTGCAGAGGCTTGTTCAGAAACATGGTCCGA 
GGAACTGGTCTTTGATTAGCAAATCAATCCCTGGACGTTCCGGCAAATCTTGTCGTCTCC 
GGTGGTGTAACGAGCTATCTCCGGAGGTAGAGCACCGTGCTTTTTCGCAGGAAGAAGACG 
AGACGATTATTCGAGCTCACGCTCGGTTTGGTAACAAGTGGGCTACGATCTCTCGTCTTC 
TCAATGGACGAACCGATAACGCTATCAAGAATCATTGGAACTCGACGCTGAAGCGAAAAT 
GCAGCGTCGAAGGGCAAAGTTGTGATTTTGGTGGTAATGGAGGGTATGATGGTAATTTAG 
GAGAAGAGCAACCGTTGAAACGTACGGCGAGTGGTGGTGGTGGTGTCTCGACTGGCTTGT 
ATATGAGTCCCGGAAGTCCATCGGGATCTGACGTCAGCGAGCAATCTAGTGGTGGTGCAC 
ACGTGTTTAAACCAACGGTTAGATCTGAGGTTACAGCGTCATCGTCTGGTGAAGATCCTC 
CAACTTATCTTAGTTTGTCTCTTCCTTGGACTGACGAGACGGTTCGAGTCAACGAGCCGG 
TTCAACTTAACCAGAATACGGTTATGGACGGTGGTTATACGGCGGAGCTGTTTCCGGTTA 
GAAAGGAAGAGCAAGTGGAAGTAGAAGAAGAAGAAGCGAAGGGGATATCTGGTGGATTCG 
GTGGTGAGTTCATGACGGTGGTTCAGGAGATGATAAGGACGGAGGTGAGGAGTTACATGG 
CGGATTTACAGCGAGGAAACGTCGGTGGTAGTAGTTCTGGCGGCGGAGGTGGCGGTTCGT 
GTATGCCACAAAGTGTAAACAGCCGTCGTGTTGGGTTTAGAGAGTTTATAGTGAACCAAA 
TCGGAATTGGGAAGATGGAGTAGGCGGCC 

>G227 Amino Acid Sequence (domain in AA coordinates: 13-112) 

MSNPTRKNMERIKGPWSPEEDDLLQRLVQKHGPRNWSLISKSIPGRSGKSCRLRWCNQLS
PEVEHRAFSQEEDETIIRAHARFGNKWATISRLLNGRTDNAIKNHWNSTLKRKCSVEGQS

CDFGGNGGYDGNLGEEQPLKRTASGGGGVSTGLYMSPGSPSGSDVSEQSSGGAHVFKPTV 

RSEVTASSSGEDPPTYLSLSLPWTDETVRVNEPVQLNQNTVMDGGYTAELFPVRKEEQVE 

VEEEEAKGISGGFGGEFMTVVQEMIRTEVRSYMADLQRGNVGGSSSGGGGGGSCMPQSVN

SRRVGFREFIVNQIGIGKME* 

>G1842 (219..809)

ACTATTA(^TGCCTCTTCCTCGCTTCAAAACGGCACCGTTTCCACTTGTTATTATTTTTC 

TCTCTATCGTCTAACAAAAAAAAAAACTGACTTGGGATTTTTTTTCATTTGT 

AAAGAAGAAGATAGAAACGAAGAAAAAAAGCAAACACATTTTGGGTCCCCGGTGGTTAGG 

ATCAAATTAGGGCACAAACCTTATCGGAGAAAGAAGCCATGGGAAGAAGAAAAGTCGAGA 

TCAAGCGAATCGAGAACAAAAGCAGTCGACAAGTCACTTTCTCCAAACGACGCAAAGGTC 

TCATCGAAAAAGCTCGACAACTTTCAATTCTCTGTGAATCTTCCATCGCTGTTGTCGCCG 

TCTCCGGTTCCGGAAAACTCTACGACTCTGCCTCCGGTGACAACATGTCZAAAGATCATTG 

ATCGTTATGAAATACATCATGCTGATGAACTTAT^AGCCTTAGATCTTGCAGAAAAAATTC 

GGAATTATCTTCCACACAAGGAGTTACTAGAAATAGTCCAAAGCAAGCTTGAAGAATCAA

ATGTCGATAATGTAAGTGTAGATTCTCTAATATCTATGGAGGAACAGCTCGAGACTGCTC 

TGTCAGTAATTAGAGCTAAGAAGACAGAACTAATGATGGAGGATATGAAGTCACTTCAAG 

AAAGGGAGAAGTTGCTGATAGAAGAGAACCAGATTCTGGCTAGCCAGGTGGGGAAGAAGA 

CGTTTCTGGTTATAGAAGGTGACAGAGGAATGTCACGGGAAAATGGCTCCGGCAACAAAG 

TACCGGAGACTCTTTCGCTGCTCAAGTAATCACCATCATCAACGGCTGAGCTTTCACCAT 

AAACTTACTCACAGCCTGATTCAGAAGCTTTTACAAAATTGTAAATTATAAAAAGCTGCA 

TAATAATCTCAACCTTTTTATCTTCCTCGCGCCAATGTGGAAATAT^AGGTAAAACAAAAC 

GAAGCTCTTTTCTTTTATGCGAAAGAATTGTAAAACTAAGATAAAGCTACCGATCTTTGT 

TGTACCTTAGTAGACAAATATCAGAGTTCTTGTGCTTGT 

>G1842 Amino Acid Sequence (domain in AA coordinates: 2-57) 
MGRRKVEIKRIENKSSRQVTFSKRRKGLIEKARQLSILCESSIAVVAVSGSGKLYDSASG 

DNMSKIIDRYEIHHADELKALDLAEKIRNYLPHKELLEIVQSKLEESNVDNVSVDSLISM

EEQLETALSVIRAKKTELMMEDMKSLQEREKLLIEENQILASQVGKKTFLVIEGDRGMSR

ENGSGNKVPETLSLLK* 

>G1505 (1..681) 

ATGGATGATATAGCGGAACTTGAATGGTTATCAAATTTCGTAGATGATTCTTCTTTCACG 
CCGTATTCTGCTCCGACGAATAAACCGGTTTGGTTAACCGGAAATCGGAGACATCTTGTA 
CAACCGGTTAAAGAGGAGACCTGCTTCAAATCCCAACATCCGGCCGTCAAAACCAGACCC 
AAACGAGCCAGAACCGGAGTCAGAGTCTGGTCTCATGGTTCGCAGTCGTTAACCGACTCA 
TCTTCAAGCTCTACAACATCTTCGTCGTCCTCTCCTCGTCCTTCAAGCCCTCTATGGCTC 
GCCAGCGGTCAGTTTCTTGATGAGCCAATGACTAAAACACAAAAGAAGAAGAAAGTTTGG 
AAAAACGCTGGTCAGACGCAAACGCAAACGCAGACGCAGACGCGGCAGTGTGGTCATTGT
GGAGTTCAGAAAACGCCGCAGTGGAGAGCAGGACCATTAGGAGCGAAGACGTTGTGTAAT 
GCGTGTGGTGTGCGTTACAAATCGGGTCGGTTACTACCCGAATATAGACCCGCTTGTAGC 
CCAACATTTTCGAGTGAGCTTCACTCAAACCACCACAGTAAAGTCATTGAGATGCGTAGG 
AAGAAAGAGACTTCTGACGGTGCTGAAGAAACCGGTTTGAACCAGCCGGTTCAGACGGTT 
CAGGTTGTCTCGAGTTTTTGA 

>G1505 Amino Acid Sequence (domain in AA coordinates: TBD) 

MDDIAELEWLSNFVDDSSFTPYSAPTNKPVWLTGNRRHLVQPVKEETCFKSQHPAVKTRP

KRARTGVRVWSHGSQSLTDSSSSSTTSSSSSPRPSSPLWLASGQFLDEPMTKTQKKKKVW

KNAGQTQTQTQTQTRQCGHCGVQKTPQWRAGPLGAKTLCNACGVRYKSGRLLPEYRPACS
PTFSSELHSNHHSKVIEMRRKKETSDGAEETGLNQPVQTVQVVSSF*
>G657 (1..2331) 

ATGAAGCGTGAGATGAAAGCACCTACTACTCCACTAGAGAGTCTCCAAGGTGACCTCAAA 
GGAAAACAAGGGAGGACATCTGGCCCTGCTAGACGATCTACCAAAGGACAATGGACACCT 
GAAGAGGACGAAGTCTTGTGTAAAGCTGTTGAGCGTTTTCAAGGAAAGAACTGGAAGAAG 
ATAGCTGAATGTTTTAAGGATCGGACTGATGTTCAGTGTCTTCATAGATGGCAAAAGGTC 
TTGAACCCAGAGCTTGTGAAAGGACCGTGGTCAAAAGAGGAGGATAACACAATAATTGAC 
CTGGTTGAAAAATATGGGGCAAAGAAATGGTCTACTATATCTCAGC^TTTACCTGGGCGC 
ATAGGAAAGCAATGTAGGGAAAGGTGGCATAACCATCTTAACCCTGGGATTAATAAAAAT 
GCATGGACTCAGGAAGAGGAACTGACTCTTATTCGTGCGCATCAAATTTATGGGAATAAA 
TGGGCAGAGCTTATGAAATTTTTGCCAGGAAGGTCAGATAATTCGATAAAAAATCATTGG 
AACAGCTCAGTTAAGAAGAAGTTGGATTCCTACTATGCATCAGGTCTTTTAGATCAGTGT 
CAAAGCTCGCCATTAATTGCCCTTCAGAACAAATCTATCGCTTCATCTTCCTCGTGGATG 
CACAGCAATGGAGATGAAGGTAGTTCAAGGCCAGGGGTTGATGCTGAGGAATCAGAATGC 
AGCCAAGCTTCAACTGTTTTCTCAC^^ 

GGAAATGAGGAATATTACATGCCTGAATTTCATTCAGGAACGGAGCAGCAAATCTCAAAC 

GCTGCATCTCATGCAGAACCGTACTACCCTTCCTTTAAAGATGTCAAAATTGTTGTCCCC 

GAAATTTCTTGTGAAACAGAATGTTCGAAGAAGTTTCAGAATCTTAATTGTTCTCACGAG 

CTAAGAACTACCACAGCTACGGAGGATCAATTGCCGGGTGTATCTAATGATGCTAAACAG 

GACCGTGGTCTAGAGTTATTGACCCATAACATGGACAACGGTGGAAAAAACCAAGCACTT 

C^CAAGATTTTCAAAGTTC^GTAAGATTAAGTGATC^CCTTTTTTGTCAAACTCGGA^ 

ACAGATCCAGAAGCTCAAACTTTGATCACGGATGAGGAGTGTTGTAGGGTTCTTTTTCCA 

GATAACATGAAAGATAGCAGTACATCTTCTGGTGAGCAAGGTCGGAATATGGTTGACCCT 

CAAAACGGCAAAGGATCTCTTTGTTCTCAGGCTGCAGAAACCCATGCTCATGAAACTGGA 

AAAGTTCCAGCTTTACCGTGGCATCCTTCAAGTTCTGAGGGCCTGGCGGGTCATAATTGT 

GTCCCTTTGTTGGATTCAGACTTGAAGGACTCACTTTTACCCCGTAATGATTCCAACGCT 

CCTATACAAGGTTGTCGCCTTTTTGGAGCTACCGAATTAGAATGTAAGACTGATACAAAT

GACGGTTTCATCGATACTTACGGACATGTAACTTCCCATGGCAATGATGATAATGGTGGT 

TTCCCAGAACAACAGGGGCTGTCATATATTCCCAAGGATTCTTTGAAGCTAGTACCTTTG 

AATAGTTTTTCTTCTCCTTCTAGAGTGAACAAGATTTATTTTCCTATTGACGATAAGCCG 

GCTGAAAAAGACAAAGGAGCTCTTTGTTATGAACCTCCACGTTTTCCAA^TGCAGATATT 

CCTTTCTTCAGCTGTGATCTTGTACCATCAAATAGTGACTTACGGCAAGAGTACAGTCCC 

TTTGGTATCCGTCAGTTGATGATTTCTTCAATGAATTGTACAACTCCGTTAAGGTTATGG 

GATTCACCGTGTCACGATAGGAGCCCTGATGTCATGCTTAATGATACTGCCAAAAGTTTT 

AGTGGTGCACCATCCATCTTAAAGAAGCGGCATCGAGACTTGCTTTCACCTGTGCTTGAT 

AGAAGAAAAGACAAAAAGCTTAAAAGGGCTGCGACTTCCTCCTTGGCTAATGATTTTTCG 

CGCTTAGATGTAATGCTTGATGAAGGAGATGATTGCATGACCTCTCGTCCGTCAGAGTCT 

CCTGAAGATAAAAA5ATATGTGCCTCCCCTTCCATAGCCAGAGATAACAGAAATTGTGCA 

TCAGCTCGGTTATATCAAGAAATGATTCCGATAGATGAGGAACCAT^AGGAAACCTTAGAA 

TCAGGTGGAGTGACTTCTATGCAAAATGAAAATGGATGTAATGACGGTGGTGCTTCAGCT 

AAAAATGTAAGTCCGTCTTTGTCCTTGCATATTATCTGGTATCAGTTATAA 

>G657 Amino Acid Sequence (domain in AA coordinates: TBD) 

MKREMKAPTTPLESLQGDLKGKQGRTSGPARRSTKGQWTPEEDEVLCKAVERFQGKNWKK

IAECFKDRTDVQCLHRWQKVLNPELVKGPWSKEEDNTIIDLVEKYGPKKWSTISQHLPGR 

IGKQCRERWHNHLNPGINKNAWTQEEELTLIRAHQIYGNKWAELMKFLPGRSDNSIKNHW

NSSVKKKLDSYYASGLLDQCQSSPLIALQNKSIASSSSWMHSNGDEGSSRPGVDAEESEC 

SQASTVFSQSTNDLQDEVQRGNEEYYMPEFHSGTEQQISNAASHAEPYYPSFKDVKIVVP 
EISCETECSKKFQNLNCSHELRTTTATEDQLPGVSNDAKQDRGLELLTHNMDNGGKNQAL
QQDFQSSVRLSDQPFLSNSDTDPEAQTLITDEECCRVLFPDNMKDSSTSSGEQGRNMVDP 
QNGKGSLCSQAAETHAHETGKVPALPWHPSSSEGLAGHNCVPLLDSDLKDSLLPRNDSNA
PIQGCRLFGATELECKTDTNDGFIDTYGHVTSHGNDDNGGFPEQQGLSYIPKDSLKLVPL
NSFSSPSRVNKIYFPIDDKPAEKDKGALCYEPPRFPSADIPFFSCDLVPSNSDLRQEYSP 
FGIRQLMISSMNCTTPLRLWDSPCHDRSPDVMLNDTAKSFSGAPSILKKRHRDLLSPVLD 
RRKDKKLKRAATS SLANDFSRLDVML^ 

SARLYQEMIPIDEEPKETLESGGVTSMQNENGCNDGGASAKNVSPSLSLHIIWYQL* 
>G1959 (141..1028)

CGTCGACTGTCCATAAATCCGGAGCCTGACCCGACGTTTGACCCGGATCCGAAACTCCCA 
CAATCTCCATACCACCCAAATTCATCTCCCCTAAAGCTTTCTCTCACTTTCCCGGGAAAA 
TCGGCGACCAAAATTGGAAAATGTACTCAGCGATTCGCTCGCTTCCACTCGATGGTGGAC 
ACGTTGGTGGTGACTACCATGGACCTCTTGACGGAACGAATCTTCCCGGTGACGCTTGTT 
TGGTTTTAACGACTGACCCTAAACCTCGTCTCCGGTGGACAACTGAGCTTCATGAGAGAT 
TCGTTGACGCCGTTACTCAGCTCGGTGGTCCTGACAAAGCGACTCCCAAAACTATTATGA 

gaac^tgggagtgaagggtctcactctctaccacctcaaatc^catcttcagaaattcc 
GCCTAGGGAGGCAAGCTGGCAAAGAATCAACTGAGAACTCTAAAGATGCTTCTTGTGTAG
GGGAGAGTCAGGACACAGGTTCATCTTCGACATCATCAATGAGAATGGCGCAGCAGGAGC
AGAACGAGGGTTACCAAGTCACCGAAGCTCTACGTGCTCAGATGGAAGTCCAAAGAAGAC
TACACGATCAATTGGAGGTGCAACGGAGGCTCCAGCTGAGGATAGAGGCACAAGGAAAAT
acctgcaatcgattcttgaaaaagcttgcaaggcctttgacgagc^gcikx:tacttttg 

CTGGACTTGAGGCTGCTAGGGAAGAGCTATCAGAGCTAGCCATCAAAGTCTCCAATAGCT 
CTCAAGGAACATCAGTCCCGTACTTCGATGCAACAAAGATGATGATGATGCCATCGTTGT 
CAGAGCTTGCAGTAGCAATAGAC^CAAAAACAACATCACAACCAACTGTTCAGTAGAAA 
GCTCTCTGACTTCCATCACACATGGGAGCTCTATATCTGCTGCATCAATGAAGAAGCGTC 
AACGTGGAGACAATTTGGGCGTAGGGTATGAATCAGGCTGGATTATGCCTAGTAGCACCA 
TTGGATAAAGTTTAGGAGAGGGAAAAAGTTCATTATGGGAAAGGTAGAGATAAGATTTAA 
CTGTTCTTTACTTGCTTTGAGGGGCCTGCGGCCGCT 

>G1959 Amino Acid Sequence (conserved domain in AA coordinates: 46-97)

MYSAIRSLPLDGGHVGGDYHGPLDGTNLPGDACLVLTTDPKPRLRWTTELHERFVDAVTQ

LGGPDKATPKTIMRTMGVKGLTLYHLKSHLQKFRLGRQAGKESTENSKDASCVGESQDTG

SSSTSSMRMAQQEQNEGYQVTEALRAQMEVQRRLHDQLEVQRRLQLRIEAQGKYLQSILE

KACKAFDEQAATFAGLEAAREELSELAIKVSNSSQGTSVPYFDATKMMMMPSLSELAVAI

DNKNNITTNCSVESSLTSITHGSSISAASMKKRQRGDNLGVGYESGWIMPSSTIG*

>G2180 (1..1440) 

ATGGCTCCTGTCTCGTTACCTCCAGGTTTCCGATTCCATCCAACAGACGAGGAACTAATT 
ACTTACTATCTAAAAAGAAAGATCAACGGTCTAGAAATCGAACTTGAAGTTATCGCTGAA 
GTTGATCTTTACAAGTGTGAGCCATGGGACTTACCAGGGAAGTCCTTGCTTCCGAGCAAA 
GACCAAGAATGGTACTTCTTCAGCCCACGAGACCGGAAGTATCCCAACGGCTCAAGGACA 
AACCGGGCAACTAAAGGCGGTTATTGGAAGGCTACAGGTAAAGACCGCCGAGTTAGTTGG 
AGAGACCGAGCCATAGGAACCAAGAAGACATTGGTTTACTACCGTGGGCGCGCGCCACAT 
GGCATAAGAACTGGTTGGGTCATGCACGAATATCGACTTGATGAAACAGAATGTGAGCCT
TCTGCATACGGCATGCAGGACGCATATGCACTTTGTCGTGTGTTCAAAAAGATTGTTATT 
GAAGCTAAGCCAAGAGATCAACATCGGTCATATGTCCACGCGATGTCGAATGTGAGTGGT 
AATTGCTCATCGAGTTTTGACACTTGTTCGGATCTCGAAATCAGTTCAACTACTCATCAA 
GTTCAAAACACATTCCAACCGCGATTTGGCAACGAGCGATTTAACTCCAACGCAATCAGC 
AACGAGGATTGGTCACAATACTACGGTTCTTCTTATAGACCGTTCCCTACTCCATATAAG 
GTTAA(^CAGAGATCGAATGTT(^TGTTACAACACAATATATATCTACCACCGTTGCGT 
GTAGAGAACTCTGCGTTTAGTGATTCCGATTTCTTCACGAGTATGACTCACAACAACGAC 
CATGGCGTTTTCGATGACTTTACTTTTGCTGCAAGTAACTCCAACCACAATAATAGCGTT 
GGTGATCAAGTGATCCACGTTGGCAATTATGATGAACAATTAATAACATCTAACCGTCAT 
ATGAACCAGACTGGTTATATAAAAGAGCAGAAGATCAGATCGAGTTTGGATAATACTGAC 
GAAGATCCAGGATTTCATGGTAACAATACCAATGACAACATAGATATCGATGATTTTCTC 
TCGTTTGATATATATAACGAGGACAACGTGAATCAAATAGAAGATAATGAAGACGTGAAT 
ACAAATGAAACCCTTGATTCATCGGGATTCGAGGTGGTTGAAGAAGAAACTAGATTTAAC 
AACCAAATGCTCATCTCGACATATCAAACGACAAAGATTCTATATCACCAAGTCGTACCT 
TGTCACACGTTGAAAGTTCACGTCAATCCTATTAGTCACAATGTGGAAGAGAGAACATTG 
TTCATTGAAGAGGACAAAGATTCTTGGTTACAAAGAGCTGAGAAGATCACGAAGACAAAA
CTAACACTTTTTAGTTTAATGGCTCAGCAATACT 

>G2180 Amino Acid Sequence (conserved domain in AA coordinates: 7-156) 

MAPVSLPPGFRFHPTDEELITYYLKRKINGLEIELEVIAEVDLYKCEPWDLPGKSLLPSK 

DQEWYFFSPRDRKYPNGSRTNRATKGGYWKATGKDRRVSWRDRAIGTKKTLVYYRGRAPH

GIRTGWVMHEYRLDETECEPSAYGMQDAYALCRVTKKIVIEAKPRDQHRSYVHAMSNVSG

NCSSSFDTCSDLEISSTTHQVQNTFQPRPGNERFNSNAISNEDWSQYYGSSYRPFPTPYK

VNTEIECSMLQHNIYLPPLRVENSAFSDSDFFTSMTHNNDHGVFDDFTFAASNSNHNNSV

GDQVIHVGNYDEQLITSNRHMNQTGYIKEQKIRSSLDNTDEDPGFHGNNTNDNIDIDDFL
SFDIYNEDNVNQIEDNEDVNTNETLDSSGFEVVEEETRFNNQMLISTYQTTKILYHQVVP
CHTLKVHVNPISHNVEERTLFIEEDKDSWLQRAEKITKTKLTLFSLMAQQYYKCIA
>G1817 (1..1308) 

ATGAAGGACGCAGAGAAGCGAGAGGTGATTGCATCATCATCATTACAAAGAAAGAGAAAC 
AGAGGAAGAAGACTAAGGAAAAGAAGAAGAAGAAACGAGAAGCGAGTACTAATGGTTCCA 
TC^TCATTACCAAACGACGTGCTAGAGGAGATCTTTTTAAGAXTTCCGGTTAAAGCCC^^ 
ATCCGACTCAAGTCTCTCTCGAAACAATGGAGATCGACGATCGAATCTCGCAGTTTTGAA 
GAGAGACACTTGACGATCGCTAAGAAAGCCTTCGTGGATCATCCCAAGGTCATGCTCGTA 
GGAGAAGAAGATCCCATAAGAGGAACCGGGATTCGTCCAGACACTGACT^TTGGTTTTAGG 
TTATTCTGCTTGGAATCGGCTTCTCTTCTATCCTTTACTCGTCTCAATTTCCCTCAAGGG 
TTCTTCAACTGGATCTACATATCTGAA^ 

AAATCACATTCCGTATATGTAGTGAATCCGGCTACACGGTGGCTCCGCCTACTTCCTCCG 
GCAGGGTTTCAGATTTTGATCCACAAGTTTAACCCCACTGAACGTGAGTGGAATGTAGTG 
ATGAAATCAATCTTTCATCTAGCATTCGTGAAGGCCACCGATTACAAATTAGTGTGGTTG 
TACAATTGTGATAAGTACATTGTTGATGCGTCGAGTCCAAACGTGGGAGTCACAAAGTGC 
GAGATTTTTGACTTTAGGAAAAATGCTTGGAGGTACTTGGCTTGCACTCCAAGTCATCAG 
ATATTCTATTACCAAAAGCCAGCATCTGCAAACGGGTCGGTTTATTGGTTTACAGAACCA 
TATAATGAAAGAATCGAAGTAGTGGCTTTTGATATTCAGACCGAAACATTCCGGTTGCTG 
CCTAAGATTAATCCGGCTATTGCTGGTTCAGATCCTCACCATATTGACATGTGCACTCTG 
GATAATAGTTTGTGTATGTCGAAAAGGGAGAAAGATACTATGATCCAAGATATTTGGAGG 
TTGAAACCATCAGAAGACACATGGGAAAAGATTTTTAGCATAGACTTGGTTTCCTGTCCT 
TCTTCTCGGACTGAGAAGCGTGATCAATTTGATTGGAGCAAGAAGGATAGGGTTGAGCCA 
GCCACACCCGTCGCGGTTTGTAAGAATAAGAAGATCCTTCTCTCACATCGCTATTCCCGA 
GGTTTGGTAAAGTACGATCCCCTT^CAAAATCTATCGATTTTTTTTCCGGACATCCTACC 
GCTTACAGAAAAGTTATTTATTTTCT^AAGTTTGATATCTCATCTATAA 

>G1817 Amino Acid Sequence (conserved domain in AA coordinates: 47-331)

MKDAEKREVIASSSLQRKRNRGRRLRKRRRRNEKRVT.MVTSSLPN^ 

IRLKSLSKQWRSTIESRSFEERHLTIAKKAFVDHPKVMLVGEEDPIRGTGIRPDTDIGFR

LFCLESASLLSFTRLNFPQGFFNWIYISESCDGLFCIHSPKSHSVYVVNPATRWLRLLPP

AGFQILIHKFNPTEREWNVVMKSIFHLAFVKATDYKLVWLYNCDKYIVDASSPNVGVTKC

EIFDFRKNAWRYLACTPSHQIFYYQKPASANGSVYWFTEPYNERIEVVAFDIQTETFRLL

PKINPAIAGSDPHHIDMCTLDNSLCMSKREKDTMIQDIWRLKPSEDTWEKIFSIDLVSCP 

SSRTEKRDQFDWSKKDRVEPATPVAVCKNKKILLSHRYSRGLVKYDPLTKSIDFFSGHPT 

AYRKVIYFQSLISHL* 

>G1649 (61..1311)

ATTCACAAAAACCGGAAAAAAAAAAAGACAAGTAAAGAAAGCTTTGTTCAGTTTACTTCA 
ATGGAAGCAAAACCCTTAGCATCATCATCATCTGAACCAAACATGATTTCTCCATCATCA 
AACATTAAACCAAAATTAAAAGATGAAGATTATATGGAGCTGGTGTGTGAAAATGGGCAG 
AITCTTGCAAAGAT^CGAAGACCAAAGAACAACGGTTCTTTTCAAAAGCAACGTAGGC^ 
TCTCTCCTGGATTTGTATGAGACCGAGTACAGCGAGGGTTTCAAGAAAAACATCAAGATT 
CTTGGAGACACACAAGTTGTTCCGGTGAGTCAGTCTAAGCCACAACAAGATAAAGAAACC 
AATGAACAAATGAACAACAATAAGAAGAAGCTAAAGTCCTCCAAAATCGAATTTGAGAGA 
AATGTTTCGAAAAGCAACAAATGTGTTGAATCATCAACATTAATTGATGTTTCTGCTAAA 
GGTCCAAAGAATGTTGAAGTTACTACAGCTCCTCCTGATGAGCAATCTGCAGCTGTTGGT 
AGATCCACGGAATTGTATTTTGCTTCTTCATCGAAGTTTTCTCGAGGAACTTCGAGAGAT 
CTAAGTTGTTGTTCTTTAAAGAGGAAGTATGGAGATATTGAAGAAGAAGAATCAACCTAT 
TTAAGTAATAATTCAGATGATGAATCAGATGATGCGAAGACACAAGTTCATGCGAGAACA 
AGAAAGCCGGTGACTAAAAGAAAACGAAGCACAGAAGTCCATAAGTTATATGAAAGAAAA 
CGAAGAGATGAATTCAACAAGAAAATGCGTGCTTTGCAGGACCTACTACCAAATTGTTAC

AAGGATGATAAGGCTTCATTGTTGGATGAGGCTATCAAATATATGCGGACCCTTCAACTT 

CAAGTTCAGATGATGAGTATGGGAAATGGATTAATAAGACCACCTACGATGTTGCCAATG 

GGTCATTACTCTCCCATGGGTCTAGGAATGCATATGGGTGCAGCAGCAACACCAACATCA 

ATACCGCAATTCCTGCCTATGAATGTTCAAGCAACCGGTTTTCCGGGGATGAACAATGCA

CCACCACAAATGCTAAGCTTTCTTAATCACCCAAGTGGACTAATTCCAAACACTCCTATC 

TTTTCTCCATTGGAAAATTGCTCTCAGCCATTCGTGGTGCCTTCGTGTGTTTCTCAGACT 

(^GGCrTACTTCTTTTACTCAATTCCCAAAGTCTGCGTCCGCCTCAAACTTAGAAGATGCA 

ATGCAATATAGAGGAAGCAACGGTTTTAGTTATTATCGCTCGCCAAACTAATGATTTGTA 

GAAAGTTGATGTTTTCTCCAACTAACTAACTTTAAGCAAAAAAAAATGATCGTCTACTCT 

GTGTTGTTAGTCTATGGGCTTTTGGGCCTTGATTCTTGGAACGATT 

ACTATTTTCAAAGTGGATGTACAAAGTAAAA 

>G1649 Amino Acid Sequence (conserved domain in AA coordinates: 225-295)

MEAKPLASSSSEPNMISPSSNIKPKLKDEDYMELVC^^ 

SLLDLYETEYSEGFKKNIKILGDTQVVPVSQSKPQQDKETNEQMNNNKKKLKSSKIEFER

NVSKSNKCVESSTLIDVSAKGPKNVEVTTAPPDEQSAAVGRSTELYFASSSKFSRGTSRD

LSCCSLKRKYGDIEEEESTYLSNNSDDESDDAKTQVHARTRKPVTKRKRSTEVHKLYERK

RRDEFNKKMRALQDLLPNCTKDDKASLLDEAIKYMRTLQLQVQMMSMGNGLIRPPTMLPM

GHYSPMGLGMHMGAAATPTSIPQFLPMNVQATGFPGMNNAPPQMLSFLNHPSGLIPNTPI

FSPLENCSQPFVVPSCVSQTQATSFTQFPKSASASNLEDAMQYRGSNGFSYYRSPN*

>G2131 (69..1010)

GTCTCTCATTTTCATAATTCCATTTTCAGGATrGTCTCT 

C^CCGGTAATGGCAAAAGTCTCTGGGAGGAGCAAGAAAACAATCGTTGACGATGAAATCA 
GCGATAAAACAGCGTCTGCGTCTGAGTCTGCGTCCATTGCCTTAAC^TCC^^CGC^AAC 
GTAAGTCGCCGCCTCGAAACGCTCCTCTTCAACGCAGCTCCCCTTACAGAGGCGTCACAA 
GGCATAGATGGACTGGGAGATACGAAGCGCATTTGTGGGATAAGAACAGCTGGAACGATA 
CACAGACCAAGAAAGGACGTCAAGTTTATCTAGGGGCTTACGACGAAGAAGAAGCAGCAG 
CACGTGCCTACGACTTAGCAGCATTGAAGTACTGGGGACGAGACACACTCTTGAACTTCC 
CTTTGCCGAGTTATGACGAAGACGTCAAAGAAATGGAAGGCCAATCCAAGGAAGAGTATA 
TTGGATCATTGAGAAGAAAAAGTAGTGGATTTTCTCGCGGTGTATCAAAATACAGAGGCG 
TTGCAAGGCATCACCATAATGGGAGATGGGAAGCTAGAATTGGAAGGGTGTTTGGTAATA 
AATATCTATATCTTGGAACATACGCCACGCAAGAAGAAGCAGCAATCGCCTACGACATCG 
CGGCAATAGAGTACCGTGGACTTAACGCCGTTACCAATTTCGACGTCAGCCGTTATCTAA 
ACCCTAACGCCGCCGCGGATAAAGCCGATTCCGATTCTAAGCCCATTCGAAGCCCTAGTC 
GCGAGCCCGAATCGTCGGATGATAACAAATCTCCGAAATCAGAGGAAGTAATCGAACCAT 
CTACATCGCCGGAAGTGATTCCAACTCGCCGGAGCTTCCCCGACGATATCCAGACGTATT 
TTGGGTGTCAAGATTCCGGCAAGTTAGCGACTGAGGAAGACGTAATATTCGATTGTTTCA 
ATTCTTATATAAATCCTGGCTTCTATAACGAGTTTGATTATGGACCTTAATCGTATTTTC 
TACAAGTTTTGTTTTGATTATCTAGACAATACATCAATATATTCT 

>G2131 Amino Acid Sequence (conserved domain in AA coordinates: 50-186, 112-183)

MAKVSGRSKKTIVDDEISDKTASASESASIALTSKRKRKSPPRNAPLQRSSPYRGVTRHR

WTGRYEAHLWDKNSWNDTQTKKGRQVYLGAYDEEEAAARAYDLAALKYWGRDTLLNFPLP

SYDEDVKEMEGQSKEEYIGSLRRKSSGFSRGVSKYRGVARHHHNGRWRARIGRVFGNKYL

YLGTYATQEEAAIAYDIAAIEYRGLNAVTNFDVSRYLNPNAAADKADSDSKPIRSPSREP

ESSDDNKSPKSEEVIEPSTSPEVIPTRRSFPDDIQTYFGCQDSGKLATEEDVIFDCFNSY 

INPGFYNEFDYGP* 

>G215 (1..1110) 

ATGACTCGTCGGTGTTCGCATTGTAGCAACAATGGGCAC^ATTCACGCACGTGTCCAACG 
CGTGGGTCTGGTTCCTCCTCCGCCGTGAAGTTATTTGGTGTGAGGTTAACGGATGGCTCG 
ATTATTAAAAAGAGTGCGAGTATGGGTAATCTCTCGGCATTGGCTGTTGCGGCGGCGGCG 
GCAACGCACCACCGTTTATCTCCGTCGTCTCCTCTGGCGACGTCAAATCTTAATGATTCG 
CCGTTATCGGATCATGCCCGATACTCTAATTTGCATCATAATGAAGGGTATTTATCTGAT 
GATCCTGCTCATGGTTCTGGGTCTAGTCACCGTCGTGGTGAGAGGAAGAGAGGTGTTCCT 
TGGACTGAAGAGGAACATAGACTATTCTTAGTCGGTCTTCAGAAACTCGGGAAAGGAGAT 
TGGCGCGGTATTTCGAGAAACTATGTAACGTCAAGAACTCCTACACAAGTGGCTAGTCAT 
GCTCAAAAGTATTTTATTCGACATACTAGTTCAAGCCGCAGGAAAAGACGGTCTAGCCTC 
TTCGACATGGTTACAGATGAGATGGTAACCGATTCATCGCCAACACAGGAAGAGCAGACC 
TTAAACGGTTCCTCTCCAAGCAAGGAACCTGAAAAGAAAAGCTACCTTCCTTCACTTGAG
CTCTCACTCAATAATACCACAGAAGCTGAAGAGGTCGTAGCCACGGCGCCACGACAGGAA 
AAATCTCAAGAAGCTATAGAACCATCAAATGGTGTTTCACCAATGCTAGTCCCGGGTGGC 
TTCTTTCCTCCTTGTTTTCCAGTGACTTACAC^ 

ACAGAACATGCCTTAAACGCTGAGACTTCTTCTCAGCAGCATCAGGTCCTAAAACCAAAA 
CCTGGATTTGCTAAAGAACGTGTGAACATGGACGAGTTGGTCGGTATGTCTCAGCTTAGC 
ATAGGAATGGCGACAAGACACGAAACCGAAACTTCCCCTTCCCCGCTATCTTTGAGACTA 
GAGCCCTCAAGGCCATCAGCGTTTCACTCGAATGGCTCGGTTAATGGTGCAGATTTGAGT 
AAAGGCAACAGCGCGATTCAGGCTATCTAA 

>G215 Amino Acid Sequence (domain in AA coordinates: TBD) 

MTRRCSHCSNNGHNSRTCPTRGSGSSSAVKLFGVRLTDGSIIKKSASMGNLSALAVAAAA 

ATHHRLSPSSPLATSNLNDSPLSDHARYSNLHHNEGYLSDDPAHGSGSSHRRGERKRGVP

WTEEEHRLFLVGLQKLGKGDWRGISRNYVTSRTPTQVASHAQKYFIRHTSSSRRKRRSSL

FDMVTDEMVTDSSPTQEEQTLNGSSPSKEPEKKSYLPSLELSLNNTTEAEEVVATAPRQE

KSQEAIEPSNGVSPMLVPGGFFPPCFPVTYTIWLPASLHGTEHALNAETSSQQHQVLKPK

PGFAKERVNMDELVGMSQLSIGMATRHETETSPSPLSLRLEPSRPSAFHSNGSVNGADLS 
KGNSAIQAI* 

>G1508 (1..420)

ATGCTAGATCACAGTGAAAAGGTCTTATTGGTTGATTCAGAAACCATGAAAACAAGAGCT 

GAAGATATGATCGAACAGAACAACACTAGTGTTAACGACAAGAAGAAGACTTGTGCTGAT 

TGTGGAACCAGTAAAACTCCTCTTTGGCGTGGTGGTCCTGTTGGTCCAAAGTCGTTGTGT 

AACGCGTGTGGGATC^GAAACAGAAAGAAGAGAAGAGGAGGAACAGAAGATAATAAGAAA 

TTAAAGAAATCGAGTTCTGGTOGCGGAAACCGTAAATTTGGTGAATCGTTAAAACAGAGT 

TTGATGGATTTGGGGATAAGGAAGAGATCAACGGTGGAGAAGCAACGACAGAAGCTTGGT 

GAAGAAGAACAAGCCGCTGTGTTACTCATGGCTCTTTCTTATGGCTCTGTTTACGCTTAG 

>G1508 Amino Acid Sequence (domain in AA coordinates: 38-63) 

MLDHSEKVLLVDSETMKTRAEDMIEQNNTSVNDKKKTCADCGTSKTPLWRGGPVGPKSLC

NACGIRNRKKJTCGGTEDNKKLKKS^ 

EEEQAAVLLMALSYGSVYA* 

>G2110 (36..1622)

GAGAGCTAATAAAAAATTTATCAAAGAAGACTAATATGGAGAAGGACGATTTCTTGAGGA 
GTGGTCATGGAAGAGAAGAAAGCCATGATGAGATGAGAAAACTTGATTCATCTCACGATG 
ATTCTCATCAAGAACACGACCATATTATAAGATCCAAGTTGGACTCAACTAAAGTCGAAA 
TGGATGAGGCTAAAGAGGAAAATCGAAGACTAAAGTCATCATTGAGTAAAATCAAGAAAG 
ATTTTGACATCCTTCAAACACAATACAACCAATTAATC 

AGTTCCAATCAAAAGGGCATCATCAAGACAAAGGCGAAGATGAAGACAGAGAAAAAGTTA 
ACGAACGTGAAGAACTTGTCTCGTTGAGCCTAGGCAGACGGTTAAATTCAGAGGTTCCAA 
GTGGTTCGAATAAAGAAGAAAAAAATAAAGATGTTGAAGAAGCGGAAGGTGACAGAAATT 
ATGATGATAATGAAAAAAGCAGTATTCAAGGGTTGAGTATGGGGATTGAATACT^GGCTT 
TGAGTAATCCTAATGAGAAGTTAGAGATTGATCATAATGAAGAAACCATGTCGTTGGAGA 
TTAGTAACAATAATAAGATCAGATCACAAAATAGTTTTGGGTTTAAGAATGATGGAGATG 
ATCATGAAGATGAAGATGAGATTTTGCCTCAAAACCTTGTTAAGAAAACTAGGGTTTCGG 
TGAGATCAAGATGTGAGACACCAACGATGAACGACGGATGTCAATGGAGGAAATATGGCC 
AGAAAATAGCTAAAGGCAATCCATGTCCCCGAGCTTACTATCGTTGCACCATTGCAGCTT 
CTTGTCCAGTAAGAAAACAGGTGCAAAGATGTTCAGAAGATATGTCTATACTTATCTCAA 
CGTACGAAGGAACACATAACCATCCACTTCCCATGTCAGCAACTGCCATGGCCTCTGCCA 
CTTCCGCTGCCGCCTCCATGCTTCTCTCCGGCGCCTCCTCCTCCTCATCCGCCGCAGCTG 
ATCTTCATGGCCTTAACTTCTCTCTTTCCGGCAACAACATCACTCCAA7UVCCTAAAACTC 
ATTTCCTCCAATCCCCTTCTTCTTCTGGCCATCCGACCGTCACTCTCGACCTCACAACCT 
CCTCCTCGTCGCAGCAACCGTTCTTATCAATGCTCAATAGATTCAGCTCTCCTCCAAGTA 
ATGTCTCACGATCTAATAGTTATCCTTCAACCAATCTCAACTTTTCAAACAACACCAACA 
CATTGATGAATTGGGGTGGTGGTGGTAATCCCAGTGATCAATACCGTGCAGCTTACGGCA 
ACATTAACACCCATCAGCAATCACCTTACCACAAAATCATTCAAACCCGAACCGCCGGGT 
CATCTTTCGATCCGTTTGGAAGATCATCTTCATCACATTCTCCACAAATAAATCTTGATC 
ATATCGGAATCAAGAACATCATCAGTCACCAAGTGCCATCTTTACCGGCTGAAACAATCA 
AGGCAATCACGACAGATCCAAGTTTCCAATCGGCTTTGGCGACAGCTCTATCTTCCATCA 
TGGGCGGCGATTTAAAGATTGATCACAATGTGACTAGAAATGAAGCTGAGAAGAGCCCTT
AAAGAGAATTGTTATATATATGTTCTTATATACTCAGTACATTGGTAAATGGGTTTAGAC 
TTTCACTAGTTTCCTAGTTCATCTATATAT^ 

TTGGAGTTTATGGAACTAATGTGTACATATGAAACTTTAGAACGAATAAATAAAACTTGG 
AATTCCTTTTTAAAAAAAAAAAAAAAAA 

>G2110 Amino Acid Sequence (conserved domain in AA coordinates: 239-298)
MEKDDFLRSGHGREESHDEMRKLDSSHDDSHQEHDHIIRSKLDSTKVEMDEAKEENRRLK

SSLSKIKKDFDILQTQYNQLMAKHNEPTKPQSKGHHQDKGEDEDREKVNEREELVSLSLG

RRLNSEVPSGSNKEEKNKDV^EAEGDRNTO^ 

NQETMSLEISNNNKIRSQNSFGFKND^ 

GCQWRKYGQKIAKGNPCPRAYYRCTIAASCPVRKQVQRCSEDMSILISTYEGTHNHPLPM

SATAMASATSAAASMLLSGASSSSSAAADLHGLNFSLSGNNITPKPKTHFLQSPSSSGHP
TVTLDLTTSSSSQQPFLSMLNRFSSPPSNVSRSNSYPSTNLNFSNNTNTLMNWGGGGNPS

DQYRAAYGNINTHQQSPYHKIIQTRTAGSSFDPFGRSSSSHSPQINLDHIGIKNIISHQV 

PSLPAETIKAITTDPSFQSALATALSSIMGGDLKIDHNVTRNEAEKSP* 

>G2442 (71..997)

TCGACC^ATTTAGACCATTCCAAATTCGTCGTCCTTTTCTCTGTGTAGTCTAATTATATA 
TTACAAGTAGATGAATTGGTTACCTGAAGCTGAAGCTGAGGAGCACTTGAAAGGTATTCT
CTCTGGTGATTTCTTTGATGGTCTCACCAATCACCTTGATTGCCCACTTGAAGACATCGA
TTCCACCAATGGTGAGGGAGATTGGGTCGCCAGGTTTCAAGACCTTGAGCCTCCTCCC^ 
GGATATGTTCCCTGCTTTGCCTTCTGACCTCACCTCTTGTCCCAAGGGCGCCGCTCGTGT 
GCGGATTCCCAACAACATGATTCCTGCTTTGAAGCAGTCCTGTTCTTCTGAAGCCTTGTC 
CGGCATTAATAGCACTCCCCACCAATCTTCAGCTCCTCCTGATATCAAAGTTTCATATCT 
ATTTC^GTCTCTAACTCCAGTGTCAGTTCTCGAGAACAGTTATGGTTCTCTCTCCACCCA 
AAACTCCGGATCTCAGAGATTGGCTTTCCCTGTGAAAGGCATGAGAAGCAAGCGCAGACG 
CCCCACAACAGTGAGACTTAGCTACCTTTTCCCCTTTGAACCCAGAT^AGTCAACTCCGGG 
TGAATCAGTAACCGAGGGTTACTATTCTTCTGAGCAACATGCCAAGAAGAAGCGCAAGAT 
TCATCTGATCACCCACACCGAGTCTTCCACTTTGGAGTCAAGTAAGTCGGATGGGATAGT
CCGGATATGCACTC^TTGTGAGACAATCACGACCCCACAGTGGAGGCAAGGACCCAGTGG 
ACCCAAGACCCTCTGCAACGCTTGCGGAGTCCGGTTCAAATCTGGTCGCCTAGTTCCAGA 
ATACCGGCCAGCCTCAAGCCCGACCTTCATCCCATCTGTGCATTCAAACTCACACAGGAA 
GATCATTGAGATGAGAAAGAAGGACGACGAGTTTGATACCAGCATGATTCGCAGTGATAT 
CCAGAAGGTAAAGCAGGGGAGGAAGAAAATGGTATAAAAGTA 

>G2442 Amino Acid Sequence (domain in AA coordinates: 220-246)

MNWLPEAEAEEHLKGILSGDFFDGLTNHLDCPLEDIDSTNGEGDWVARFQDLEPPPLDMF

PALPSDLTSCPKGAARVRIPNNMIPALKQSCSSEALSGINSTPHQSSAPPDIKVSYLFQS

LTPVSVLENSYGSLSTQNSGSQRLAFPVKGMRSKRRRPTTVRLSYLFPFEPRKSTPGESV

TEGYYSSEQHAKKKRKIHLITHTESSTLESSKSDGIVRICTHCETITTPQWRQGPSGPKT

LCNACGVRFKSGRLVPEYRPASSPTFIPSVHSNSHRKIIEMRKKDDEFDTSMIRSDIQKV
KQGRKKMV* 

>G1051 (66..1031)

CCTGTAAATTGAGATTTGCTTTCTTTGGTAA 

CTTCAATGG(^C7^CTCCCTCCTAAAATCCCC^(^TGACACAACATTGGCCTGATTTCT 
CTTCCCAAAAGCTCTCTCCTTTCTCTACCCCAACCGCAACCGCTGTCGCCACCGCTACAA 
CCACCGTACAAAACCCCTCATGGGTCGACGAATTCCTCGACTTCTCAGCGTCTCGCCGTG 
GCAACCACCGTCGTTCCATCAGCGACTCTATCGCATTCCTCGAAGCTCCAACAGTCAGCA 
TCGAAGACCACCAATTCGACAGGTTCGATGACGAACAGTTCATGTCGATGTTCACCGACG 
ACGACAACCTTCATAGCAATCCTTCCCATATCJ^ 

CGGGATCTTCCTCGAACACATCCACGCCGTCCAATAGCTTCAACGACGATAACAAAGAAT 
TACCACCGTCCGATCATAACATGAACAATAATATCAACAACAACTATAACGATGAAGTCC 
AAAGCCAATGCAAGATGGAGCCAGAAGATGGTACGGCGTCGAATAACAATTCCGGTGATA 
GCTCCGGCAACCGGATTCTCGATCCCAAAAGGGTTAAGAGAATATTAGCAAATCGGCAAT 
CAGCACAGAGATCAAGGGTGAGGAAACTGCAATACATATCAGAGCTCGAACGTAGCGTCA 
CTTCGTTGCAGGCGGAAGTGTCAGTGTTATCGCCAAGAGTTGCATTCTTGGATCATCAAC 
GTTTGCTTCTTAACGTTGACAACAGCGCTCTCAAGCAACGAATCGCTGCTTTATCTC7UVG 
ACAAGCTTTTCAAAGACGCACATCAAGAAGCATTGAAGAGAGAAATAGAGAGACTTCGAC 
AAGTGTATAATCAACAAAGCCTCACGAATGTGGAAAATGCAAATCATTTATCGGCGACCG 
GAGCCGGTGCTACTCCGGCCGTCGACATCAAGTCGTCCGTTGAAACAGAGCAGCTCCTCA 
ATGTCTCATAAATTAACCATCATGCATCATCATCAACATTTCTCTCTTTTAGCTTCTTGG

CAAAAGTTCTTGACTATAAAATCTCTTTCGGGTAAGAAATTCAGGAGATATACATTTTTT 

ATTCTAATCACATTGTTTTTAAGTTGTGATGAATTCAGTTTGATGTATCTTATTTATTTT 

GTTTATGTCGTCTTTTTTTCTTGK3GGTTGATGGAAGGGAATCATCAATTGTTGTTTC 

AAAGAACTAGTTGAATTTTTTTTTTTTTTTT 

>G1051 Amino Acid Sequence (domain in AA coordinates: 189-250)
MAQLPPKIPNMTQHWPDFSSQKLSPFSTPTATAVATATTTVQNPSWVDEFLDFSASRRGN
HRRSISDSIAFLEAPTVSIEDHQFDRFDDEQFMSMFTDDDNLHSNPSHINNKNNNVGPTG
SSSNTSTPSNSFNDDNKELPPSDHNMNNNINNNYNDEVQSQCKMEPEDGTASNNNSGDSS
GNRILDPKRVKRILANRQSAQRSRVRKLQYISELERSVTSLQAEVSVLSPRVAFLDHQRL
LLNVDNSALKQRIAALSQDKLFKDAHQEALKREIERLRQVYNQQSLTNVENANHLSATGA
GATPAVDIKSSVETEQLLNVS*
>G1052 (138..1127)

TGATCATCTAAAACTTTCAATTTCTCTCTTC^ 
TCAAATCTTTGATCCTTTCCTTTGTTT^ 

CCATTAAATCTTTATTAATGGCACAACTTCCTCCGAAAATCCCAACCATGACGACGCCAA 
ATTGGCCTGACTTCTCCTCCCAGAAACTCCCTTCCATAGCCGCAACGGCGGCAGCCGCAG 
CAACCGCTGGACCTCAACAACAAAACCCTTCATGGATGGATGAGTTTCTCGACTTCTCAG 
CGACTCGCCGTGGGACTCACCGTCGTTCTATAAGCGACTC 

CTTCCTCCGGCGTCGGAAACCACCACTTCGATAGGTTTGACGACGAGCAATTCATGTCCA 
TGTTCAACGACGACGTACACAACAATAACCACAATCATCATCATCATCACAGCATCAACG 
GCAATGTGGGTCCCACGCGTTCATCCTCCAACACCTCCACGCCGTCCGATCATAATAGCC 
TTAGCGACGACGACAACAACAAAGAAGCACCACCGTCCGATCATGATCATCACATGGACA 
ATAATGTAGCCAATCAAAACAACGCCGCCGGTAACAATTACAACGAATCAGACGAGGTCC 
AAAGCCAGTGCAAGACGGAGCCACAAGATGGTCCGTCGGCGAATCAAAACTCCGGTGGAA 
GCTCCGGTAATCGTATTCACGACCCTAAAAGGGTAAAAAGAATTTTAGCAAATAGGCAA^ 
CAGCACAGAGATCAAGGGTGAGGAAATTGCAATACATATCAGAGCTTGAAAGGAGCGTTA 
CTTCATTGCAGACTGAAGTGTCAGTGTTATCGCCAAGAGTTGCGTTTTTGGATCATCAGC 
GATTGCTTCTCAACGTCGACAATAGTGCTATCAAGCAA^^ 

ATAAGATTTTCAAAGACGCTCATC^GAAGCATTGAAGAGAGAAATAGAGAGACTTCGAC 
AAGTATATCATCT^ACAAAGCCTCAAGAAGATGGAGAATAATGTCTCCGATCAATCTCCGG 
CCGATATCAAACCGTCCGTTGAGAAGGAACAGCTCCTCAATGTCTAAAGCTGTTCGTTCA 
CTAAGATCTTTCTTTTCATGGCGAAAAGATTCTTGACTATAAAACCTCTTTGTGTCAAGA 
AATTAATTTATCAAAGAAGATGGCCTTTTTTATTTGATCTAATCACATTTTTTTAAGTTG 
TGATGAATTTGCTTTTGATGTATCTGTTTTTTTTTTTTTTTTTT 

>G1052 Amino Acid Sequence (domain in AA coordinates: 201-261)
MAQLPPKIPTMTTPNWPDFSSQKLPSIAATAAAAATAGPQQQNPSWMDEFLDFSATRRGT
HRRSISDSIAFLEPPSSGVGNHHFDRFDDEQFMSMFNDDVHNNNHNHHHHHSINGNVGPT
RSSSNTSTPSDHNSLSDDDNNKKAPPSDHDHHMDNNVANQNNAAGNNYNESDEVQSQCKT
EPQDGPSANQNSGGSSGNRIHDPKRVKRILANRQSAQRSRVRKLQYISELERSVTSLQTE
VSVLSPRVAFLDHQRLLLNVDNSAIKQRIAALAQDKIFKDAHQEALKREIERLRQVYHQQ
SLKKMENNVSDQSPADIKPSVEKEQLLNV* 
>G1079 (1..1995) 

ATGGGTTGTGCTGCTTCAAGAATTGATAATGAAGAAAAGGTTTTAGTGTGTAGGCAGAGA 
AAGAGGCTAATGAAAAAGTTATTAGGGTTCAGGGGAGAATTTGCAGATGCACAGTTGGCT 
TATCTTAGAGCTTTGAGGAACACTGGTGTTACTCTTAGGCAATTCACTGAGTCTGAGACC 
TTGGAGCTTGAAAACACTAGTTATGGTTTAAGTTTGCCTTTGCCTCCTTCGCCTCCTCCT 
ACATTGCCTCCTTGACCTCCACCACCTCCTCCATTTAGCCCGGATTTGAGAAATCCTGAG 
ACTAGTCATGACTTGGCTGATGAGGAGGAAGAGGGTGAAAATGATGGTGGTAATGATGGA 
AGTGGTG(^GCTCCTeCGCCTCCATTGCCGAATTCTTGGAACATTTGGAACCCTTTTGAG 
TCACTTGAGCTGCATAGTCATCCAAATGGTGACAATGTAGTTACACAAGTTGAACTGAAG 
AAGAAACAACAAATTCAGCAAGCTGAAGAGGAAGATTGGGCGGAGACGAAGTCTCAATTT 
GAGGAAGAAGATGAGCT^ACAAGAAGCAGGAGGTACTTGCCTTGATTTGAGTGTTCATCAA 
ATAGAGGCTGTTAGTGGCTGTAACATGAAGAAGCCACGTCGTCTGAAGTTTAAGCTGGGA 
GAAGTTATGGACGGTAACTCATCTATGACAAGCTGCTCCGGTAAAGATCTTGAGAAAACT 
CATGTGACTGATTGTAGAATCAGGAGGACCTTAGAAGGAATCATCAGAGAGTTGGATGAT 
TATTTTCTTAAAGCATCGGGTTGCGAGAAGGAGATAGCTGTGATAGTAGACATCAACAGT 
AGGGATACTGTTGATCCTTTCAGGTACCAGGAAACAAGAAGGAAGAGAAGCAGCTCGGCA

AAGGTATTCAGTGCATTGTCATGGAGTTGGTCTTCAAAGTCTCTTCAGTTGGGCAAAGAT 

GCTACAACAAGCGGGACTGTTGAACCCTGTAGGCCTGGAGCTC^CTGCAGCACACTTGAG 

AAGCTATACACAGCTGAGAAGAAACTTTACCAGCTAGTCAGAAACAAAGAGATTGCCAAA 

GTGGAGCATGAGAGGAAGTCTGCATTACTGCAAAAGCAAGATGGGGAAACCTATGATTTG 

AGCAAAATGGAGAAAGCACGCTTGTCTTTGGAGAGTTTGGAAACCGAGATACAGCGTCTA 

GAAGATTCCATAACTACAACACGCTCATGTTTGCTTAACTTGATCAATGATGAGCTGTAT 

CCGCAGCTAGTTGCTTTAACTTCAGGGCTAGCACAGATGTGGAAAACAATGCTCAAGTGT 

CATQU^GTTGAAATTCATATATCCCAGCAACTGAACCATCTTCCGGATTACCCGAGTATA 

GATCTCAGTTCGGAATACAAACGCCAGGCGGTTAATGAACTAGAGACCGAGGTTACTTGC 

TGGTACAATAGCTTTTGCAAGTTAGTAAATTCCCAGCGAGAATACGTGAAAACACTCTGT 

ACGTGGATCCAACTTACTGATCGCCTCTCTAACGAAGACAACCAAAGAAGTAGCTTGCCT 

GTTGCTGCTCGTAAGCTCTGCAAAGAGTGGCAGCTTGAATACAACCTGCGTAGGAAATGC 

AATAAACTTGAGAGGAGGCTTGAGAAAGAGCTAATTTCACTGGCTGAGATTGAAAGAAGG 

CTCGAGGGGATTTTAGCAATGGAAGAGGAGGAAGTAAGCTCAACGAGTTTGGGCTCTAAG 

CATCCGTTGTCAATCAAACAAGCCAAGATCGAAGCCTTGAGAAAACGAGTGGATATTGAG 

AAAACTAAGTACTTAAACTCGGTCGAGGTTAGTAAGAGAATGACACTAGACAACCTCAAA 

TCAAGCCTTCCCAATGTCTTTCAGATGTTGACTGCTCTAGCTAATGTCTTTGCCAATGGG 

TTTGAATCCGTTAATGGCCAAACCGGTACAGATC 

GAATCTCAACCCTAA 

>G1079 Amino Acid Sequence (conserved domain in AA coordinates: 1-50)
MGCAASRIDNEEKVLVCRQRKRLMKKLLGFRGEFADAQLAYLRALRNTGVTLRQFTESET

LELENTSYGLSLPLPPSPPPTLPPSPPPPPPFSPDLRNPETSHDLADEEEKGENDGGNDG
SGAAPPPPLPNSWNIWNPFESLELHSHPNGDNVVTQVELKKKQQIQQAEEEDWAETKSQF
EEEDEQQEAGGTCLDLSVHQIEAVSGCNMKKPRRLKFKLGEVMDGNSSMTSCSGKDLEKT
HVTDCRIRRTLEGIIRELDDYFLKASGCEKEIAVIVDINSRDTVDPFRYQETRRKRSSSA
KVFSALSWSWSSKSLQLGKDATTSGTVEPCRPGAHCSTLEKLYTAEKKLYQLVRNKEIAK

VEHERKSALLQKQDGETYDLSKMEKARLSLESLETEIQRLEDSITTTRSCLLNLINDELY

PQLVALTSGLAQMWKTMLKCHQVQIHISQQLNHLPDYPSIDLSSEYKRQAVNELETEVTC

WYNSFCKLVNSQREYVKTLCTWIQLTDRLSNEDNQRSSLPVAARKLCKEWQLEYNLRRKC

NKLERRLEKELISLAEIERRLEGILAMEEEEVSSTSLGSKHPLSIKQAKIEALRKRVDIE

KTKYLNSVEVSKRMTLDNLKSSLPNVFQM

ESQP*

>G1335 (56..667)

TTTTTTTTTAAAAGATTTAGAGAGAAAAGTGAGTTATTAAGAGATTCCAATCAAAATGAG 

CGGAGACAACGGCGGTGGTGAGAGGCGCAAAGGCTCCGTCAAGTGGTTTGATACCCAGAA 

GGGTTTCGGCTTCATCACTCCTGACGACGGTGGCGACGATCTCTTCGTTCACCAGTCCTC 

CATCAGATCTGAGGGTTTCCGTAGCCTCGCTGCCGAAGAAGCCGTAGAGTTCGAGGTTGA 

GATCGACAACAACAACCGTCCCAAGGCCATCGATGTTTCTGGACCCGACGGCGCTCCCGT 

CCAAGGAAACAGCGGTGGTGGTTCATCTGGCGGACGCGGCGGTTTCGGTGGAGGAAGAGG 

AGGTGGACGCGGATCTGGAGGTGGATACGGCGGTGGCGGTGGTGGATACGGAGGAAGAGG 

AGGTGGTGGTCGAGGAGGCAGCGACTGCTACAAGTGTGGTGAGCCCGGTCACATGGCGAG

AGACTGTTCTGAAGGCGGTGGAGGTTACGGAGGAGGCGGCGGTGGCTACGGAGGTGGAGG 

CGGATACGGCGGAGGAGGTGGTGGTTACGGAGGTGGTGGCCGTGGAGGTGGTGGCGGCGG 

GGGAAGCTGCTACAGCTGTGGCGAGTCGGGACATTTCGCCAGGGATTGCACCAGCGGTGG 

ACGTTAAAACCAACGCCGGTTACGCGGTGGAGAAGAGTGAGTTGGTTATCTCACAAGTGA 

TCGGTTCTTTCTCCCGCCGCCTTCTATCTCTCTATTATCCACTTTTTGCTTATTATGATG 

GATCTCTATCTTTGTTAGTTGGTTTTTTCTTGATGGTTTCGGATTAGGACTCTTCTTTTG 

GTTTTGCTACTTATGGTTGGTTTTATTTATGGTACTTGTGATATGGGTGAAATGCTCTAC 

TTGTTGCTCTGTTTCAAGTGTTCATAATATGCGAACAAATATTCTGGGTTTTGTTTCAAA 
AAAAA 

>G1335 Amino Acid Sequence (domain in AA coordinates: 24-43, 131-144, 185-203) 

MSGDNGGGERRKGSVKWFDTQKGFGFITPDDGGDDLFVHQSSIRSEGFRSLAAEEAVEFE 

VEIDNNNRPKAIDVSGPDGAPVQGNSGGGSSGGRGGFGGGRGGGRGSGGGYGGGGGGYGG 

RGGGGRGGSDCYKCGEPGHMARDCSEGGGGYGGGGGGYGGGGGYGGGGGGYGGGGRGGGG 
GGGSCYSCGESGHFARDCTSGGR* 

>G157 (31..621)
GGGCATAACCCTTATCGGAGATTTGAAGCCATGGGAAGAAGAAAAATCGAGATCAAGCGA

ATCGAGAACAAAAGCAGTCGACAAGTCACTTTCTCCAAACGACGCAATGGTCTCATCGAC 

AAAGCTCGACAACTTTCGATTCTCTGTGAATCCTCCGTCGCTGTTGTCGTCGTATCTGCC 

TCCGGAAAACTCTATGACTCTTCCTCCGGTGACGACATTTCCAAGATCATTGATCGTTAT 

GAAATACAACATGCTGATGAACTTAGAGCCTTAGATCTTGAAGAAAAAATTCAGAATTAT 

CTTCCACACAAGGAGTTACTAGAAACAGTCC^UUGCAAGCTTGAAGAACCAAATGTCGAT 

AATGTAAGTGTAGATTCTCTAATTTCTCTGGAGGAACAACTTGAGACTGCTCTGTCCGTA 

AGTAGAGCTAGGAAGGCAGAACTGATGATGGAGTATATCGAGTCCCTTAAAGAAAAGGAG 

AAATTGCTGAGAGAAGAGAACCAGGTTCTGGCTAGCCAGATGGGAAAGAATACGTTGCTG 

GCAACAGATGATGAGAGAGGAATGTTTCCGGGAAGTAGCTCCGGCAACAAAATACCGGAG 

ACTCTCCCGCTGCTCAATTAGCCACCATCATCAACGGCTGAGTTTTCACCTTAAACTCAA 

AGCCTGATTCATAATTAAGAGAATAAATTTGTATATTATAAAAAGCTGTGTAATCTCAAA 

CCTTTTATCTTCCTCTAGTGTGGAATTTAAGGTCAAAAAGAAAACGAGAAAGTATGGATC 

AGTGTTGTACCTCCTTCGGAGACAAGATCAGAGTTTGTGTGTTTGTGTCTGAATGTACGG 

ATTGGATTTTTAAAGTTGTGCTTTCTTTCTTCAAAAAAAAAAA 

>G157 Amino Acid Sequence (domain in AA coordinates: 2-57) 

MGRRKIEIKRIENKSSRQVTFSKRRNGLIDKARQLSILCESSVAVVVVSASGKLYDSSSG 

DDISKIIDRYEIQHADELRALDLEEKIQNYLPHKELLETVQSKLEEPNVDNVSVDSLISL

EEQLETALSVSRARKAELMMEYIESLKEKEKLLREENQVLASQMGKNTLLATDDERGMFP

GSSSGNKIPETLPLLN* 

>G1895 (1..954) 

ATGAATAACCAATCTGTTACTGACAATACAAGTCTTAAGCTGTCATCTAATCTTAACAAC 
GAGTC^U^GAAACATCTGAGAACAGTGATGACC^CACAGCGAGATCAC^CAATTACA 
TCGGAAGAAGAGAAAACAACTGAACTGAAGAAACCAGACAAGATTCTTCCATGTCCGAGA 
TGCAACAGCGCAGACACCAAATTCTGTTACTACAACAACTACAACGTTAACCAGCCACGT 
CACTTCTGTAGAAAATGCCAGAGGTATTGGACCGCTGGTGGATCCATGAGGATCGTCCCG 
GTTGGCTCAGGCCGTCGCAAGAACAAGGGATGGGTTTCTTCAGACCAGTACCTGCACATC 
ACTTCCGAGGATACTGACAATTACAATAGCTCCTCAACAAAGATTCTAAGCTTCGAGTCT 
TCGGACTCTTTGGTAACTGAGAGGCCTAAGCATCAATCAAACGAAGTGAAGATAAACGCT 
GAACCTGTTTCACAAGAACCCAACAACTTCCAAGGGTTACTTCCT 

GTTTCGCCTCCTTGGCCTTACCAATACCCTCCAAACCCTAGTTTCTACCACATGCCCGTC 

TACTGGGGCTGCGCGATACCGGTTTGGTCTACCCTCGACACTTCTACATGTCTTGGGAAA 

AGGACAAGAGACGAAACTTCTCATGAAACTGTTAAAGAGAGTAAAAATGCTTTTGAGAGA 

ACAAGCTTGCTTTTGGAATCTCAGAGCATCAAAAATGAAACAAGTATGGCTACAAATAAC 

CATGTGTGGTATCCAGTACCGATGACCCGCGAGAAGACACAAGAATTCAGCTTTTTCAGT 

AATGGAGCTGAAACAAAGAGCAGCAACAACAGATTCGTCCCTGAAACGTATCTTAACCTG 

CAAGCAAACCCTGCAGCCATGGCAAGATCTATGAACTTCAGAGAGAGCATATAA 

>G1895 Amino Acid Sequence (domain in AA coordinates: 55-110) 

Ml^QSVTDOTSLKLSSNLNNESra 

CNSADTKFCYYNNYNVNQPRHFCRKCQRYWTAGGSMRIVPVGSGRRKNKGWVSSDQYLHI 
TSEDTDimFSSSTKILSFESSDSLVra^ 

VSPPWPYQYPPNPSFYHMPVYWGCAIPVWSTLDTSTCLGKRTRDETSHETVKESKNAFER
TSLLLESQSIKNETSMATNNHVWYPVPMTREKTQEFSFFSNGAETKSSNNRFVPETYLNL
QANPAAMARSMNFRESI*
>G1900 (1..897) 

ATGCTGGAAACTAAAGATCCTGCGATAAAGCTCTTTGGTATGAAAATTCCTTTCCCGACG 
GTTTTAGAGGTTGCTGATGAAGAAGAAGAAAAGAACCAAAACAAGACATTAACTGATCAA 
TCGGAGAAAGACAA^CCCTAAAGAAACCAACCAAGATTCTTCCATGTCCAAGATGCAAC 
AGCATGGAGACTAAGTTCTGTTACTAC^CAACTACAACGTAAACCAACCTCGCCAT^ 
TGTAAAGCTTGTCAGAGATATTGGACCTCAGGTGGGACCATGAGAAGTGTTCCAATCGGA 
GCAGGACGGCGCAAGAACAAGAACAACTCACCAACTTCACATTACCACCATGTGACTATC 
TCCGAAACAAATGGTCCGGTCCTTAGTTTCAGCCTCGGAGATGATCAAAAGGTCTCGAGT 
AATAGGTTTGGTAATCAAAAGCTAGTTGCTAGGATAGAGAACAATGACGAGCGCTCTAAT 
AACAACACTTCGAACGGTTTGAATTGTTTTCCGGGAGTTTCGTGGCCGTACACGTGGAAT 
CCTGCGTTTTACCCGGTTTACCCTTATTGGAGCATGCCAGTGTTGTCTTCTCCGGTAAGT 
TCAAGTCCTACTTCTACTCTTGGTAAGCATTCGAGAGACGAAGACGAGACGGTGAAGCAA 
AAACAGAGGAATGGATCTGTATTGGTTCCAAAGACTTTGAGAATTGATGATCCTAATGAA 
GCTGCAAAGAGTTCGATATGGACAACACTTGGGATCAAGAACGAAGTTATGTTCAATGGG
TTTGGTTCGAAGAAAGAGGTTAAGCTCAGTAACAAAGAAGAAACAGAGACCTCACTTGTT 
CTTTGTGCA^CCCTGCTGCGTTATCAAGATCAATCAATTTCCATGAGCAGATGTGA 
>G1900 Amino Acid Sequence (domain in AA coordinates: 54-106) 

MLETKDPAIKLFGMKIPFPTVLEVADEEEEKNQNKTLTDQSEKDKTLKKPTKILPCPRCN 
SMETKFCYYNNYNVNQPRHFCKACQRYWTSGGTMRSVPIGAGRRKNKNNSPTSHYHHVTI

SETNGPVLSFSLGDDQKVSSNRFGNQKLVARIENNDERSNNNTSNGLNCFPGVSWPYTWN 
PAFYPVYPYWSMPVLSSPVSSSPTSTLGKHSRDEDETVKQKQRNGSVLVPKTLRIDDPNE 
AAKSSIWTTLGIKireVMFNGFGSKKEVKLSN^ 
>G2007 (1..861) 

ATGGGAAGGCAGCCATGTTGTGACAAGCTCATGGTGAAGAAGGGGCCGTGGACGGCGGAG 
GAAGACAAGAAACTGATAAACTTTATCTTGACCAACGGCCACTGTTGCTGGAGGGCTTTG 
CCGAAGCTGGCCGGTCTCCGTCGCTGTGGGAAGAGCTGCCGTCTACGGTGGACCAATTAT 
CTCCGACCTGACTTGAAGAGAGGTCTTCTCTCCGACGCCGAGGAACAGCTTGTCATCGAC 
CTTCATGCTCTTCTCGGCAACAGATGGTCCAAGATCGCTGCAAGATTACCAGGAAGAACA 
GACAACGAAATAAAAAATCATTGGAATACTCATATCAAGAAGAAGCTCCTTAAGATGG7VA 
ATCGATCCTTCGACCCATCAACCTTTAAACAAAGTATTTACCGATACAAACTTAGTCGAT 
AAATCTGAAACTTCATCGAAAGCCGACAATGTAAATGATAATAAAATCGTAGAGATCGAT 
GGGACAACGACAAATACAATAGATGATAGGATTATCACT 

GATTATGAATTACTTGGTGATATAATTCATAATTATGGAGATTTATTTAATATTCTATGG 

ACCAACGATGAACCTCCTCTAGTCGATGATGCATCATGGAGCAATCATAACGTTGGTATT 

GGAGGAACAGCTG(^GTTGCAGCCT(^GACAAGAACAACACTGCTGCCGAGGAAGATTTC 

CCGGAAAGATCATTTGAAAAACAGAACGGCGAAAGTTGGATGTTCTTGGATTATTGCCAA 

GAATTTGGTGTTGAAGATTTTGGGTTCGAGTGTTACCATGGTTTTGGTCAAAGCTCCATG 
AAGACGGGTCACAAGGACTAG

>G2007 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGRQPCCDKLMVKKGPWTAEEDKKLINFILTNGHCCWRALPKLAGLRRCGKSCRLRWTNY

LRPDLKRGLLSDAEEQLVIDLHALLGNRWSKIAARLPGRTDNEIKNHWNT^ 

IDPSTHQPLNKVFTDTNLVDKSETSSKADNVNDNKIVEIDGTTTNTIDDSIITHQN 

DYELLGDIIHNYGDLFNILWTNDEPPLVDDASWSimWGIGGTAAVAASDKNOT 

PERSFEKQNGESWMFLDYCQEFGVEDFGFECYHGFGQSSMKTGHKD* 

>G214 (238..2064)

TGAGATTTCTCCATTTCCGTAGCTTCTGGTCTCTTTTCTTTGTTTCATTGATCAAAAGCA 

AATCACTTCTTCTTCTTCTTCTTCTCGATTTCTTACTGTTTTCTTATCCAACGAAATCTG 

GAATTAAAAATGGAATCTTTATCGAATCCAAGCTGATTTTGTTTCTTTCATTGAATCATC 

TCTCTAAAGTGGAATTTTGTAAAGAGAAGATCTGAAGTTGTGTAGAGGAGCTTAGTGATG 

GAGACAAATTCGTCTGGAGJ^GATCTGGTTATTAAGACTCGGAAGCCATATACGATAACA 

AAGCAACGTGAAAGGTGGACTGAGGAAGAACATAATAGATTCATTGAAGCTTTGAGGCTT 

TATGGTAGAGCATGGCAGAAGATTGAAGAACATGTAGCAACAAAAACTGCTGTCCAGATA 

AGAAGTCACGCTCAGAAATTTTTCTCCAAGGTAGAGAAAGAGGCTGAAGCTAAAGGTGTA 

GCTATGGGTCAAGCGCTAGACATAGCTATTCCTCCTCCACGGCCTAAGCGTAAACCAAAC 

AATCCTTATCCTCGAAAGACGGGAAGTGGAACGATCCTTATGTCAAAAACGGGTGTGAAT 

GATGGAAAAGAGTCCCTTGGATCAGAAAAAGTGTCGCATCCTGAGATGGCCAATGAAGAT 

CGACAACAATCAAAGCCTGAAGAGAAAACTCTGCAGGAAGACAACTGTTCAGATTGTTTC 

ACTCATCAGTATCTCTCTGCTGCATCCTCCATGAATAAAAGTTGTATAGAGACATCAAAC 

GCAAGCACTTTCCGCGAGTTCTTGCCTTCACGGGAAGAGGGAAGTCAGAATAACAGGGTA 

AGAAAGGAGTCAAACTCAGATTTGAATGCAAAATCTCTGGAAAACGGTAATGAGCAAGGA 

CCTCAGACTTATCCGATGCATATCCCTGTGCTAGTGCCATTGGGGAGCTCAATAACAAGT 

TCTCTATCACATCCTCCTTCAGAGCCAGATAGTCATCCCCACACAGTTGCAGGAGATTAT 

CAGTCGTTTCCTAATC^TATAATGTCAACCCTTTTACAAACACCGGCTCTTTATACTGCC 

GCAACTTTCGCCTCATCATTTTGGCCTCCCGATTCTAGTGGTGGCTCACCTGTTCCAGGG 

AACTCACCTCCGAATCTGGCTGCCATGGCCGCAGCCACTGTTGCAGCTGCTAGTGCTTGG 

TGGGCTGCCAATGGATTATTACCTTTATGTGCTCCTCTTAGTTCAGGTGGTTTCACTAGT 

CATCCTCCATCTACTTTTGGACCATCATGTGATGTAGAGTACACAAAAGCAAGCACTTTA 

CAACATGGTTCTGTGCAGAGCCGAGAGCAAGAACACTCCGAGGCATCAAAGGCTCGATCT 

TCACTGGACTCAGAGGATGTTGAAAATAAGAGTAAACCAGTTTGTCATGAGCAGCCTTCT 

GCAACACCTGAGAGTGATGCAAAGGGTTCAGATGGAGCAGGAGACAGAAAACAAGTTGAC 
CGGTCCTCGTGTGGCTCAAACACTCCGTCGAGTAGTGATGATGTTGAGGCGGATGCATCA
GAAAGGCAAGAGGATGGCACCAATGGTGAGGTGAAAGAAACGAATGAAGACACTAATAAA 
CCTCAAACTTCAGAGTCCAATGCACGCCGCAGTAGAATCAGCTCCAATATAACCGATCCA 
TGGAAGTCTGTGTCTGACGAGGGTCGAATTGCCTTCCAAGCTCTCTTCTCCAGAGAGGTA 
TTGCCGCAAAGTTTTACATATCGAGAAGAACACAGAGAGGAAGAACAACAACAACAAGAA 
CAAAGATATCCAATGGCACTTGATCTTAACTTCACAGCTCAGTTAACACCAGTTGATGAT 
CAAGAGGAGAAGAGAAACACAGGATTTCTTGGAATCGGATTAGATGCTTCAAAGCTAATG 
AGTAGAGGAAGAACAGGTTTTAAACCATACAAAAGATGTTCCATGGAAGCCAAAGAAAGT 
AGAATCCTCAACAACAATCCTATCATTCATGTGGAACAGAAAGATCCCAAACGGATGCGG 
TTGGAAACTCAAGCTTCCACATGAGACTCTATTTTCATCTGATCTGTTGTTTGTACTCTG 
TTTTTAAGTTTTCAAGACCACTGCTACATTTTCTTTTTCl^TTC 

TTCCTTGTCC^TAGTCTTCCTGTAACATTTGACTCTGTATTATTCAACAAATCATAAACT 
GTTTAATCTTTTTTTTTCCA 

>G214 Amino Acid Sequence (domain in AA coordinates: 22-71) 

METNSSGEDLVIKTRKPYTITKQRERWTEEEHNRFIEALRLYGRAWQKIEEHVATKTAVQ

IRSHAQKFFSKVEKEAEAKGVAMGQALDIAIPPPRPKRKPNNPYPRKTGSGTILMSKTGV

NDGKESLGSEKVSHPEMANEDRQQSKPEEKTLQEDNCSDCFTHQYLSAASSMNKSCIETS

NASTFREFLPSREEGSQNNRVRKESNSDLNAKSLENGNEQGPQTYPMHIPVLVPLGSSIT

SSLSHPPSEPDSHPHTVAGDYQSFPNHIMSTLLQTPALYTAATFASSFWPPDSSGGSPVP 

GNSPPNLAAMAAATVAAASAWWAANGLLPLCAPLSSGGFTSHPPSTFGPSCDVEYTKAST 

LQHGSVQSREQEHSEASKARSSLDSEDVENKSKPVCHEQPSATPESDAKGSDGAGDRKQV 

DRSSCGSNTPSSSDDVEADASERQEDGTNGEVKETNEDTNKPQTSESNARRSRISSNITD

PWKSVSDEGRIAFQALFSREVLPQSFTYREEHREEEQQQQEQRYPMALDLNFTAQLTPVD

DQEEKRNTGFLGIGLDASKLMSRGRTGFKPYKRCSMEAKESRILNNNPIIHVEQKDPKRM
RLETQAST* 

>G2155 (63..740)

CTCATATATACCAACC^y^CCTCTCTCTGCATCTTTATTAAC^CAAAATTCCAAAAGATT 
AAATGTTGTCGAAGCTCCCTACACAGCGACACTTGCACCTCTCTCCCTCCTCTCCCTCCA 
TGGAAACCGTCGGGCGTCCACGTGGCAGACCTCGAGGTTCCAAAAACAAACCTAAAGCTC 
CAATCTTTGTCACCATTGACCCTCCTATGAGTCCTTACATCCTCGAAGTGCCATCCGGAA 
ACGATGTCGTTGAAGCCCTAAACCGTTTCTGCCGCGGTAAAGCCATCGGCTTTTGCGTCC 
TCAGTGGCTCAGGCTCCGTTGCTGATGTCACTTTGCGTCAGCCTTCTCCGGCAGCTCCTG 
GCTCAACCATTACTTTCCACGGAAAGTTCGATCTTCTCTCTGTCTCCGCCACTTTCCTCC 
CTCCTCTACCTCCTACCTCCTTGTCCCCTCCCGTCTCCAATTTCTTCACCGTCTCTCTCG 
CCGGACCTCAGGGGAAAGTCATCGGTGGATTCGTCGCTGGTCCTCTCGTTGCCGCCGGAA 
CTGTTTACTTCGTCGCCACTAGTTTCAAGAACCCTTCCTATCACCGGTTACCTGCTACGG 
AGGAAGAGCAAAGAAACTCGGCGGAAGGGGAAGAGGAGGGACAATCGCCGCCGGTCTCTG 
GAGGTGGTGGAGAGTCGATGTACGTGGGTGGCTCTGATGTCATTTGGGATCCCAACGCCA 
AAGCTCCATCGCCGTACTGACCACAAATCCATCTCGTTCAAACTAGGGTTTCTTCTTCTT 
TAGATCATC7UIGAATCAACAAAAAGATTGCATTTTTAGATTCTTTGTAATATCATAATTG 
ACTCACTCTTTAATCTCTCTATCACTTCTTCTTTAGCTTTTTCTGCAGTGTCAAACTTCA 
CATATTTGTAGTTTGATTTGACTATCCCCAAGTTTTGTATTTTATCATACAAATTTTTGC 



GAGATTGAATGTATAATATAATGGTTTAAT 
>G2155 Amino Acid Sequence (domain in AA coordinates: 18-38)
MLSKLPTQRHLHLSPSSPSMETVGRPRGRPRGSKNKPKAPIFVTIDPPMSPYILEVPSGN 
DVVEALNRFCRGKAIGFCVLSGSGSVADVTLRQPSPAAPGSTITFHGKFDLLSVSATFLP
PLPPTSLSPPVSNFFTVSLAGPQGKVIGGFVAGPLVAAGTVYFVATSFKNPSYHRLPATE 
EEQRNSAEGEEEGQSPPVSGGGGESMYVGGSDVIWDPNAKAPSPY* 
>G234 (106..1035)

CACAACATCATACCCACCAACATATATAATGTTGATCATAGAGAGATAAACAGAGGCCGC 
TATCAAGAACAAGACTAAGAACAAGACTTCACTAGGAGTACAAGTATGGGAAGAGCACCG 
TGTTGTGACAAAGCAAACGTGAAGAAAGGGCCTTGGTCTCCTGAGGAAGATGCAAAACTC 
AAATCTTACATTGAAAATAGTGGCACCGGAGGCAATTGGATCGCTTTGCCTCAAAAGATT 
GGTTTAAAGAGATGTGGAAAGAGTTGCAGGCTGAGGTGGCTTAACTATCTTAGACCAAAC 
ATCAAACATGGTGGCTTCTCTGAGGAAGAAGAAAACATCATTTGTAGCCTTTACCTTACA 
ATTGGTAGCAGGTGGTCTATAATCGCTGCTCAATTGCCGGGACGAACAGACAACGATATA 




AAAAACTATTGGAACACGAGGCTCAAGAAGAAACTCATTAACAAACAACGCAAGGAGCTT
CAAGAAGCTTGTATGGAGCAGCAAGAGATGATGGTGATGATGAAGAGACAACACCAACAA 
CAACAAATCCAAAC1TCTTTTATGATGAGACAAGACCAAA 

CATCATCATAATGTTCAAGTTCCAGCTCTTTTCAGAATCAAACCAACTCGTTTTGCGACC 

AAGAAGATGTTAAGCCAGTGCTCATCAAGAACATGGTCAAGATCGAAGATCAAGAACTGG 

AGAAAACAAACCTCATCATCATCAAGATTCAATGACAACGCTTTTGATCATCTCTCTTTC 

TCTCAACTCTTGTTAGATCCTAATCATAACCACTTAGGATCAGGAGAGGGTTTCTCCATG 

AACTCTATCTTGAGCGCCZAACACAAACTCTCCATTGCTTAACACAAGTAATGATAATCAG 

TGGTTCGGGAATTTCCAGGCCGAAACCGTAAACTTGTTCTCAGGAGCCTCCACAAGTACT 

TCGGCAGATCAAAGCACTATAAGTTGGGAAGACATAAGCTCTCTTGTTTATTCTGATTCA 

AAGCAATTTTTTTAATTATAATAATATATTATTCTTAAGATGAAACGTACATCATTAT 

TTAATTGGGGGTACGTAACGTATATATGGAATAACGATCTAGTTTGTTTAAATTTAAAA 

>G234 Amino Acid Sequence (domain in AA coordinates: 14-115) 

MGRAPCCDKANVKKGPWSPEEDAKLKSYIENSGTGGNWIALPQKIGLKRCGKSCRLRWLN

YLRPNIKHGGFSEEEENIICSLYLTIGSRWSIIAAQLPGRTDNDIKNYWNTRLKKKLINK

QRKELQEACMEQQEMMVMMKRQHQQQQIQTSFMMRQDQTMFTWPLHHHNVQVPALFRIKP
TRFATKKMLSQCSSRTWSRSKIKNWRKQTSSSSRFNDNAFDHLSFSQLLLDPNHNHLGSG

EGFSMNSILSANTNSPLLNTSNDNQWFGNFQAETVNLFSGASTSTSADQSTISWEDISSL

VYSDSKQFF* 

>G361 (54..647)

TCTGTCTCTCTCTCTCTCTTTGTAAATATACATATATAGATAAGCTCACATATATGGCGA 
CTGAAACATCTTCTTTGAAGCTCTTCGGTATAAACCTACTTGAAACGACGTCGGTTCAAA 
ACCAGTCATCGGAACCAAGACCCGGATCCGGATCAGGATCCGAGTCACGTAAGTACGAGT 
GTCAATACTGTTGTAGAGAGTTTGCTAACTCTCAAGCTCTTGGTGGTCACCAAAACGCTC 
ACAAGAAAGAGCGTCAGCTTCTTAAACGTGCACAGATGTTAGCTACTCGTGGTTTGCCAC 
GTCATCATAATTTTCACCCTCATACCAATCCGCTTCTCTCCGCCTTCGCGCCGCTGCCTC 
ACCTCCTCTCTCAGCCGCATCCTCCGCCGCATATGATGCTCTCTCCTTCTTCTTCGAGTT 
CTAAGTGGCTTTACGGTGAACACATGTCGTCACAAAACGCCGTTGGGTACTTTCATGGTG 
GAAGGGGACTTTACGGAGGTGGCATGGAGTCTATGGCCGGAG7^AGTAAAGACTCATGGTG 
GTTCTTTGCCGGAGATGAGGAGGTTCGCCGGAGATAGTGATCGGAGTAGCGGAATTAAGT 
TAGAGAATGGTATTGGGCTGGACCTCCATTTAAGCCTTGGGCCATGAATGATTATAATTT 
TGGCCC^GTAAAGATCTGTAAAATACTACTAGGATTTCATTTTTATAGAGTATGTTTTTT 
TCCTTAATTTCGGTTGAAATTGGTGAATATTTTTATCTCTTACTTACC^IAATCTCATATT 
TCTATGTATGCGTTTGCTTTCACTTTTTTTTTTTATATAATTCTTCTTGTAAAAAATGCA 
ATGTGAGTTTTCTTCCCTATCATTCTGTCAAGCTTTGGTTCAATTATTTAGTAATCGAAT 
AATATAGGAATAGTGTTGAAAG 

>G361 Amino Acid Sequence (domain in AA coordinates: 43-63) 

MATETSSLKLFGINLLETTSVQNQSSEPRPGSGSGSESRKYECQYCCREFANSQALGGHQ 

NAHKKERQLLKRAQMLATRGLPRHHNFHPHTNPLLSAFAPLPHLLSQPHPPPHMMLSPSS

SSSKWLYGEHMSSQNAVGYFHGGRGLYGGGMESMAGEVKTHGGSLPEMRRFAGDSDRSSG

IKLENGIGLDLHLSLGP* 

>G562 (137..1285)

ATTTGAATTTCTGGGTTTCTCTCTGTTTAAGCTTCTTCTTCTTCATCTTCTGCTTACGTT 
TCTTCTTCAAGGAGCTTTCGGATTCTTGTAGAAAGAGTCATTGTTCTCTTGAGTGGGAAA 
CCTTGAAACCATTCCTATGGGAAATAGCAGCGAGGAACCAAAGCCTCCTACCAAATCAGA 
TAAACCATCTTCACCCCCGGTGGATCAAACAAATGTTCATGTCTACCCTGATTGGGCAGC 
TATGCAGGCATATTATGGTCCAAGAGTAGCAATGCCTCCTTATTACAATTCAGCTATGGC 
TGCATCTGGTCATCeTCCTCCTCCTTACATGTGGAATCCTCAGCATATGATGTCACCATC 
TGGAGCACCCTATGCTGCTGTTTATCCTCATGGAGGAGGAGTTTACGCTCATCCCGGTAT 
TCCCATGGGATCACTGCCTCAAGGTCAAAAGGATCCACCTTTAACAACTCCGGGGACGCT 
TTTGAGCATCGACACTCCTACTAAATCTACAGGGAACACAGACAATGGATTGATGAAGAA 
GCTGAAAGAGTTTGATGGGCTTGCTATGTCTCTAGGAAATGGGAATCCTGAAAATGGTGC 
AGATGAACATAAACGATCACGGAACAGCTCAGAAACT^ATGGTTCTACTGATGGAAGTGA 
TGGGAATACAACTGGGGCAGATGAACCGAAACTTAAAAGAAGTCGAGAGGGAACTCCAAC 
AAAAGATGGGAAACAATTGGTTCAAGCTAGCTCATTTCATTCTGTTTCTCCGTCAAGTGG 
TGATACCGGCGTAAAACTCATTCAAGGATCTGGAGCTATACTCTCTCCTGGTGTAAGTGC 
AAATTCCAACCCCTTCATGTCACAATCTTTAGCCATGGTTCCTCCTGAAACTTGGCTTCA 
GAACGAGAGAGAACTGAAACGGGAGCGAAGGAAACAGTCTAATAGAGAATCTGCTAGAAG

GTCAAGATTAAGGAAACAGGCCGAGACAGAAGAACTTGCTAGGAAAGTGGAAGCCTTGAC 

AGCCGAAAACATGGCATTAAGATCTGAACTAAACCAACTTAATGAGAAATCTGATAAACT 

AAGAGGAGCAAATGCAACCTTGTTGGACAAACTGAAATGCTCGGAACCCGAAAAGAGAGT 

CCCCGCAAATATGTTGTCTAGAGTTAAGAACTCAGGAGCTGGAGATAAGAACAAGAACCA 

AGGAGACAATGATTCTAACTCTACAAGCAAATTCCATCAACTGCTCGATACGAAGCCTCG 

AGCTAAAGCAGTAGCTGCAGGCTGAATCGATGGTAATTCATGTCGATTTCTACTTAATTT 

GTCGACATAAACAAAGAAAATAAGTGCTACTAATTTCAGAAAAACTTGATAGATAGATAG 

TATAGTAGAGAGAGAGAGAGAGAGAGAGGTGTGATGATTATTGATCTATAAATTTTCGGA 

GAGAGAGAGGGAGAAAGAGAAACTTTTCCTCCAGATGAAAATTTGGTGTTATGGTTTGTT 

ACTGTTAATATAGAGAGGCTTTTCTTTTTTTATAAAATGGCTTCCTTTGTTGCA 

>G562 Amino Acid Sequence (domain in AA coordinates: 253-315) 

MGNSSEEPKPPTKSDKPSSPPVDQTNVHVYPDWAAMQAYYGPRVAMPPYYNSAMAASGHP 

PPPYMWNPQHMMSPSGAPYAAVYPHGGGVYAHPGIPMGSLPQGQKDPPLTTPGTLLSIDT 

PTKSTGNTDNGLMKKLKEFDGLAMSLGN^ 

ADEPKLKRSREGTPTKDGKQLVQASSFHSVSPSSGDTGVKLIQGSGAILSPGVSANSNPF 

MSQSLAMVPPETWLQNERELKRERRKQSNRESARRSRLRKQAETEELARKVEALTAENMA

LRSELNQLNEKSDKLRGANATLLDKLKCSEPEKRVPANMLSRVKNSGAGDKNKNQGDNDS

NSTSKFHQLLDTKPRAKAVAAG* 

>G591 (88..1020)

GTAAATCTCTCTTTGAAGGTTCCTAACTCGTTAATCGTAACTCACAGTGACTCGTTCGAG 
TCAAAGTCTCTGTCTTTAGCTCAAACCATGGCTAGTAACAACCCTCACGACAACCTTTCT 
GACCAAACTCCTTCTGATGATTTCTTCGAGCAAATCCTCGGCCTTCCTAACTTCTCAGCC 
TCTTCTGCCGCCGGTTTATCTGGAGTTGACGGAGGATTAGGTGGTGGAGCACCGCCTATG 
ATGCTGCAGTTGGGTTCCGGAGAAGAAGGAAGTCACATGGGTGGCTTAGGAGGAAGTGGA 
CCAACTGGGTTTCACAATCAGATGTTTCCTTTGGGGTTAAGTCTTGATCAAGGGAAAGGA 
CCTGGGTTTCTTAGACCTGAAGGAGGACATGGAAGTGGGAAAAGATTCTCAGATGATGTT 
GTTGATAATCGATGTTCTTCTATGAAACCTGTTTTCCACGGGCAGCCTATGCAACAGCCA 
CCTCCATCGGCCCCACATCAGCCTACTTCAATCCGTCCCAGGGTTCGAGCTAGGCGTGGT 
CAGGCTACTGATCCACATAGCATCGCTGAGCGGCTACGTAGAGAAAGAATAGCAGAACGG 
ATCAGGGCGCTGCAGGAACTTGTACCTACTGTGAACAAGACCGATAGAGCTGCTATGATC 
GATGAGATTGTCGATTATGTAAAGTTTCTCAGGCTCCAAGTCAAGGTTTTGAGCATGAAC 
CGACTTGGTGGAGCCGGTGCGGTTGCTCCACTTGTTACTGATATGCCTCTTTCATCATCA 
GTTGAGGATGAAACGGGTGAGGGTGGAAGGACTCCGCAACCAGCGTGGGAGAAATGGTCT 
AACGATGGGACTGAACGTCAAGTGGCTAAACTGATGGAAGAGAACGTTGGAGCCGCGATG 
CAGCTTCTTCAATCAAAGGCTCTTTGTATGATGCCAATCTCATTGGCAATGGCAATTTAC 
CATTCTCAACCTCCGGATACATCTTCAGTGGTCAAGCCTGAGAACAATCCTCCACAGTAG 
GATTTCTGCAATAAAGAGTTTGTACAGCTAATCCAACTGTCCAAC^TGGGTTTTTCTTCT 
GCTCTAATGACTCTGGTTTCTTCTCTCCTCTCTCACCGACTTGAAAGGTAAAAAAGTGAA 
AAAGGCTTTGTAGATGGAATCAATGTAGGATTTGCAGTAGAGGGCAAAAAAATGTCATAT 
AGCTCAATTGATCAAGTCTTAAAAAAAAAAAAAAAAAAAA 

>G591 Amino Acid Sequence (domain in AA coordinates: 143-240) 

MASNNPHDNLSDQTPSDDFFEQILGLPNFSASSAAGLSGVDGGLGGGAPPMMLQLGSGEE 

GSHMGGLGGSGPTGFHNQMFPLGLSLDQGKGPGFLRPEGGHGSGKRFSDDVVDNRCSSMK

PVFHGQPMQQPPPSAPHQPTSIRPRVRARRGQATDPHSIAERLRRERIAERIRALQELVP 

TVNKTDRAAMIDEIVDYVKFLRLQVKVLSMNRLGGAGAVAPLVTDMPLSSSVEDETGEGG

RTPQPAWEKWSNDGTERQVAKLMEENVGAAMQLLQSKALCMMPISLAMAIYHSQPPDTSS

VVKPENNPPQ*

>G8 (247..1596)

AAAAAAAAATATCCGTCTCACTCTCTCGCCGCCGGTAACATTTCCCGGCGACAAAACTTC 
TCTACTCTCACCATTCCTCCATCGTAATCTCTAAATTCTTCTCCATTCTCTTCTTCCTCC 
CGATCATCTCGAGCTCTTCGTGAGAGATTATGTGATTATGTAATCGTTGTTGCTGTAGAA 
GACGATCTCTAACAACTGATTCCTTC7VTCATCACCTTCGCTAGATTTGTAATTTTCAGAG 
CTTGAGATGTTGGATCTTAACCTCAACGCTGATTCTCCCGAGTCGACTCAGTACGGTGGT 
GACTCATACTTAGATCGGCAGACATCAGACAACTCCGCCGGGAATCGAGTGGAAGAGTCC 
GGTACATCGACGTCGTCAGTTATCAATGCCGATGGAGACGAAGACTCTTGCTCTACTCGA 
GCTTTCACTCTCAGTTTCGATATTTTAAAAGTCGGAAGTAGTAGCGGCGGAGACGAAAGC 
CCCGCCGCTTCAGCTTCCGTTACTAAAGAGTTTTTTCCGGTGAGTGGAGACTGTGGACAT
CTACGAGATGTTGAAGGATCATC^VAGCTCTAGAAACTGGATAGATCTTTCTTTTGACCGT 
ATTGGTGACGGAGAAACGAAATTGGTAACTCCGGTTCCGACTCCGGCTCCGGTTCCGGCT 
CAGGTTAAAAAGAGTCGGAGAGGACCAAGGTCTAGAAGTTCACAGTATAGAGGAGTTACT 
TTTTATAGAAGAACTGGTCGATGGGAGTCACATATTTGGGATTGTGGGAAACAAGTTTAT 
TTAGGTGGTTTCGACACTGCTCATGCTGCAGCTAGAGCTTATGATCGAGCTGCTATTAAA 
TTTAGAGGTGTTGATGCTGATATCAACTTTACTCTTGGTGATTATGAGGAAGATATGAAA 
CAGGTACAAAACTTGAGTAAGGAAGAGTTTGTGCATATACTGCGTAGACAGAGCACGGGG 
TTTTCGCGGGGGAGTTCGAAGTATCGAGGGGTTACGTTACACAAATGTGGTAGATGGGAA 
GCTAGGATGGGGCAGTTTCTTGGTAAAAAGGCTTATGACAAGGCTGCAATCAACACTAAT 
GGTAGAGAAGCAGTCACGAACTTCGAGATGAGTTCATACCAAAATGAGATTAACTCTGAG 
AGCAATAACTCTGAGATTGACCTCAACTTGGGAATCTCTTTATCGACCGGTAATGCGCCA 
AAGCAAAATGGGAGGCTCTTTCACTTCCCTTCTAATACTTATGAAACTCAGCGTGGAGTT 
AGCTTGAGGATAGATAACGAATACATGGGAAAGCCGGTGAATACACCTCTTCCTTATGGA 
TCCTCGGATCATCGCCTTTACTGGAACGGAGCATGCCCGAGTTATAATAATCCCGCCGAG 
GGAAGAGCAACAGAAAAGAGAAGTGAAGCTGAAGGGATGATGAGTAACTGGGGATGGCAG 
AGACCGGGGOW^CAAGCGCCGTGAGACCGCAGCCACCGGGACCACAACCACCACCATTG 
TTCTCAGTTGCAGC^GCATCATCAGGATTCTCACATTTCCGGCCACAACCTCCCAATGAC 
AATGCAACACGTGGTTACTTTTATCCACACCCTTAACTTGTAAGGGGACATATGAGAGTT 
TTTTTACCATCTCTCTCTCTCTCAACACTCTAGTCCCCTTTCAAAAATGTCATTTGGGTT 
TTAGATTTTTCACATACAATGATCAATTTTTCC 

>G8 Amino Acid Sequence (domain in AA coordinates: 151-217, 243-296) 

MLDLNLNADSPESTQYGGDSYLDRQTSDNSAGNRVEESGTSTSSVINADGDEDSCSTRAF 

TLSFDILKVGSSSGGDESPAASASVTKEFFPVSGDCGHLRDVEGSSSSRNWIDLSFDRIG 

DGETKLVTPVPTPAPVPAQVKKSRRGPRSRSSQYRGVTFYRRTGRWESHIWDCGKQVYLG

GFDTAHAAARAYDRAAIKFRGVDADINFTLGDYEEDMKQVQNLSKEEFVHILRRQSTGFS

RGSSKYRGVTLHKCGRWEARMGQFLGKKAYDKAAINTNGREAVTNFEMSSYQNEINSESN

NSEIDLNLGISLSTGNAPKQNGRLFHFPSNTYETQRGVSLRIDNEYMGKPVNTPLPYGSS

DHRLYWNGACPSYNNPAEGRATEKKSEAEGMMSNWGWQRPGQTSAVRPQPPGPQPPPLFS

VAAASSGFSHFRPQPPNDNATRGYFYPHP*

>G859 (162..752)

GATTTGTCATTTTTTGTCTAGCCAAAJ^AAAAAAAAAAAAAGGAGAGAGAGAGAGAGAGA 
GAGAGAGAGAGAAACGAAGAAAAAAAAAGAAGCAAAAAACATTGTGGGTCTCCGGTGATT 
AGGATCAAATTAGGGCACCAGCCTTATCGGAGGAAGAAGCCATGGGTAGAAAAAAAGTCG 
AGATCAAGCGAATCGAGAACAAAAGTAGTCGACAAGTCACTTTCTCCAAACGACGCAATG 
GTCTCATCGAGAAAGCTCGACAACTTTCAATTCTCTGTGAATCTTCCATCGCTGTTCTCG 
TCGTCTCCGGCTCCGGAAAACTCTACAAGTCTGCCTCCGGTGACAACATGTCAAAGATCA 
TTGATCGTTACGAAATACATCATGCTGATGAACTTGAAGCCTTAGATCTTGCAGAAAAAA 
CTCGGAATTATCTGCCACTCAAAGAGTTACTAGAAATAGTCCAAAGCAAGCTTGAAGAAT 
CAAATGTCGATAATGCAAGTGTGGATACTTTAATTTCTCTGGAGGAACAGCTCGAGACTG 
CTCTGTCCGTAACTAGAGCTAGGAAGACAGAACTAATGATGGGGGAAGTGAAGTCCCTTC 
AAAAAACGGAGAACTTGCTGAGAGAAGAGAACCAGACTTTGGCTAGCCAGGTGGGGAAGA 
AGACGTTTCTGGTTATAGAAGGTGACAGAGGAATGTCATGGGAAAATGGCTCCGGCAACA 
AAGTACGGGAGACTCTTCCGCTGCTCAAGTAATCACCATCATCAACGGCTGAGCTTTCAC 
CTTAAACTTACAGCCTGATTCAGAAGTTTTTACAAATTTGTAAATTATAAAAAGCTTCAT 
AATAATCTCAACCTTTTTATCTTCCTCGCGCCAATGTGGAAATTAAGGTTAAAAATAAAA 
TAAAACAGAAGCTCATGCGAAAGAATTGTAAAACTAAGATAAAGCTATAGTAGATCTTTA 
TTGTACCTTCGTAGACGATATAAGATTTATTCGTGTGTTTGTCTTCCCCTCNAAAAAAAA 
AAAAAAAAAAAAAAAA 

>G859 Amino Acid Sequence (domain in AA coordinates: TBD) 

MGRKKVEIKRIENKSSRQVTFSKRRNGLIEKARQLSILCESSIAVLVVSGSGKLYKSASG 

DNMSKIIDRYEIHHADELEALDLAEKTRNYLPLKELLEIVQSKLEESNVDNASVDTLISL

EEQLETALSVTRARKTELMMGEVKSLQKTENLLREENQTLASQVGKKTFLVIEGDRGMSW 

ENGSGNKVRETLPLLK* 

>G878 (197..1738)

CAAAAAAAATCTCTCCCATTAAAAGACTGCCCAAAGAAATATTTTATACAAAATGAAAGA 
GAGAAACACGACACGAATTTTGTATAATTAAGATTACACAAAAAAAAGTGTTAGAAAGAG 
AAATATCTTCTTCTTTTTTCTGTGTGAGTTGGGTTTGTTAAAGTTTTATCCTTTTTGTTC
TCAAAATCAAGAATCGATGGCGGAGAAGGAAGAAAAAGAACCATCGAAGTTAAAATCATC 
CACCGGAGTTTCACGGCCAACGATTTCACTACCTCCTCGACCGTTTGGTGAAATGTTTTT 
TAGCGGTGGCGTTGGATTTAGTCCTGGACCAATGACTCTCGTCTCAAATTTATTCTCTGA 
TCCTGATGAGTTCAAGTCTTTCTCTCAGCTTTTAGCTGGAGCTATGGCTTCTCCGGCGGC 
AGCTGCTGTTGCCGCCGCTGCTGTGGTTGCTACTGCTCATCATCAGACACCTGTGAGCTC 
TGTCGGTGATGGCGGTGGAAGCGGTGGTGATGTTGACCCGAGGTTTAAGCAGAGTAGACC 
AACGGGATTGATGATAACTCAACCACCGGGGATGTTTACTGTACCGCCGGGGTTAAGTCC 
GGCTACTCTTTTGGATTCTCCGAGCTTCTTTGGTCTTTTTTCACCTCTTCAGGGAACATT 
TGGTATGACACAT(^(^GCTTTAGCACAAGTC^CTGCACAAGCAGTTCAAGGCAATAA 
TGTTCATATGCAGCAATCACAACAATCTGAATATCCTTCTTCTACAC^CAAC^CTyV^ 
AGAACAACAACAAGCTTCATTGACTGAGATTCC^ 

GATTCGAGCCTCGGTTCAAGAAACATCGCAGGGTCAGAGAGAGACTTCGGAAATATCTGT 
CTTTGAGCATCGGTCACAGCCTCAAAATGCTGACAAACCAGCTGATGATGGATACAACTG 
GCGGAAATATGGGCAGAAGCAAGTGAAGGGGAGCGATTTTCCTCGGAGTTATTACAAATG 
TACGCATCCAGCTTGTCCTGTCAAGAAGAAAGTGGAGAGGTCACTCGATGGACAAGTAAC 
GGAAATCATCTACAAGGGTCAACACAATCATGAGCTTCCTCAAAAGCGCGGTAACAATAA 
CGGGAGTTGTAAAAGTTCTGATATTGCAAATCAGTTTCAAACAAGTAATAGCAGTCTCAA 
CAAGAGTAAGAGGGACCAGGAAACAAGCCAAGTTACAACAACAGAGCAGATGTCTGAAGC 
AAGTGATAGCGAGGAGGTTGGGAATGCAGAGACTAGTGTGGGAGAAAGACATGAGGATGA 
GCCTGATCCCAAGCGAAGAAATACAGAAGTTCGGGTTTCAGAACCAGTTGCTTCATCGCA 
TAGAACTGTGACAGAGCCTAGGATTATTGTCCAAACGACGAGTGAAGTTGACCTCTTAGA 
TGATGGATATAGGTGGCGCAAGTATGGTCAGAAAGTAGTCAAAGGAAATCCTTATCCGAG 
GAGCTACTATAAGTGTACAACACCAGATTGCGGAGTAAGGAAACATGTAGAGAGAGCAGC 
AACTGACCCAAAAGCTGTTGTAACAACATATGAAGGTAAACATAACCATGATGTTCCAGC 
TGCTAGAACCAGCAGCCATCAGTTAAGACCAAACAATCAACACAACACCTCAACGGTTAA 
CTTCT^ATCATCAACAGCCTGTTGCACGTTTAAGGCTTAAAGAAGAGCAAATCACTTGACA 
GAGAAGAAGAATACGACGGCGCTTGAGCTTTTGTGAGTTTAATGAATCTTCTTTTTGGTT 
AATGAACCTGTTTTTGTTGCCTCAAAACACCACAGGT^ 

TTACAGTTTCAAAAGGTATGTTCTTTTATTTCATGTTGGAATCTTCTGTGTAATCTTAAG 

AAGCTTTAGGAGGTAATGTAAAAAACCAGATTCAAAGTTATGCCCTTATGTGAATTCTTT 

TGTA(^TGGGATAAACAAAATTTACAGGTATCCTTTTTGTTCTTGTTGTAAAAAAAAAAA 
AAAA 

>G878 Amino Acid Sequence (domain in AA coordinates: 250-305, 415-475)

MAEKEEKEPSKLKSSTGVSRPTISLPPRPFGEMFFSGGVGFSPGPMTLVSNLFSDPDEFK

SFSQLLAGAMASPAAAAVAAAAVVATAHHQTPVSSVGDGGGSGGDVDPRFKQSRPTGLMI

TQPPGMFTVPPGLSPATLLDSPSFFGLFSPLQGTFGMTHQQALAQVTAQAVQGNNVHMQQ

SQQSEYPSSTQQQQQQQQQASLTEIPSFSSAPRSQIRASVQETSQGQRETSEISVFEHRS 

QPQNADKPADDGYNWRKYGQKQVKGSDFPRSYYKCTHPACPVKKKVERSLDGQVTEIIYK

GQHNHELPQKRGNNNGSCKSSDIANQFQTSNSSLNKSKRDQETSQVTTTEQMSEASDSEE

VGNAETSVGERHEDEPDPKRRNTEVRVSEPVASSHRTVTEPRIIVQTTSEVDLLDDGYRW

RKYGQKVVKGNPYPRSYYKCTTPDCGVRKHVERAATDPKAVVTTYEGKHNHDVPAARTSS

HQLRPNNQHNTSTVNFNHQQPVARLRLKEEQIT* 

>G971 (131..1171)

TTTTTTTTCTTCCCTCTTTTAGAACTCTCTCTCTCTCTCGTTTTTGACACTTATC 

TCTTTTTTCTCTCTCCCTCTCTCTCTGGCCGGAAAAAAGAACAACGTCGTTTATAGCTAA 

AGATTCGATCATGTTGGATCTTAACCTAAAGATCTTTTCTTCTTATAACGAAGATCAAGA 

TCGGAAAGTACCATTAATGATCTCAACCACCGGTGAAGAAGAATCTAACTCATCTTCCTC 

CTCCACAACAGACTCTGCAGCGAGAGATGCTTTCATCGCTTTTGGAATTCTCAAACGCGA 

CGATGACCTTGTTCCTCCTCCTCCTCCTCCTCCTCATAAAGAAACAGGAGATCTCTTTCC 

GGTGGTGGCTGATGCTCGTCGGAATATAGAATTCTCCGTGGAAGACAGTCACTGGTTGAA 

TCTTTCTTCTTTACAAAGAAATACACAGAAAATGGTGAAGAAGAGCAGAAGAGGACCAAG 

GTCTCGTAGCTCCCAATATCGTGGCGTCACTTTTTACCGTCGCACCGGTCGTTGGGAATC 

TCATATTTGGGATTGTGGAAAGCAAGTTTATTTGGGCGGGTTTGATACTGCTTACGCAGC 

AGCAAGGGCTTACGACCGAGCTGCTATCAAATTCCGTGGTCTCGATGCAGACATCAATTT 

CGTCGTGGATGATTATAGGCATGACATCGATAAGATGAAGAATTTAAATAAGGTGGAGTT 

CGTGCAAACACTTAGGCGAGAGAGTGCGAGTTTCGGAAGAGGAAGTTCCAAATACAAAGG 
CTTGGCTCTTCAAAAATG(^CCCAATTCAAAACTCATGATCAGATTCATCTCTTCCAA^

CAGGGGATGGGATGCAGCAGCAATAAAATACAATGAGTTGGGAAAGGGAGAAGGAGCCAT 

GAAGTTTGGTGCCCATATCAAAGGAAATGGTCACAATGATCTTGAACTAAGTCTCGGAAT 

TTCATCATCATCGGAAAGTATAAAGTTGACAACAGGCGATTACTATAAGGGTATCAATCG 

GTCCACGATGGGTTTATACGGTAAGCAATCATCGATATTTTTACCCATGGCAACCATGAA 

ACCTCTGAAGACAGTTGCAGCATCATCAGGATTCCCTTTTATCAGCATGAGAAGTTCCTC 

TTCCTCCATGTCCAATTGTTTTGATCCATAGGATCGTTCTACACTCTCTTAACTAATATA 

TATTTTTACTCTATCTGATTATTGTATACAAGGATAAAATTTGATTCTTTCCT^ 

TGAGAAATATTGGAAGTGTTAAAAAAAAAAAAAAAAAAAAAAA 

>G971 Amino Acid Sequence (conserved domain in aa coordinates: 120-186) 

MLDLNLKIFSSYNEDQDRKVPLMISTTGEEESNSSSSSTTDSAARDAFIAFGILKRDDDL 

VPPPPPPPHKETGDLFPVVADARRNIEFSVEDSHWLNLSSLQRNTQKMVKKSRRGPRSRS

SQYRGVTFYRRTGRWESHIWDCGKQVYLGGFDTAYAAARAYDRAAIKFRGLDADINFVVD

DYRHDIDKMKNLNKVEFVQTLRRESASFGRGSSKYKGLALQKCTQFKTHD 

DAAAIKYNELGKGEGAMKFGAHIKGNGHNDLELSLGISSSSESIKLTTGDYYKGINRSTM

GLYGKQSSIFLPMATMKPLKTVAASSGFPFISMTSSSSSMSNCFDP*

>G975 (58..657)

ATTACTCATCATCAAGTTCCTACTTTCTCTCTGACAAACATCACAGAGTAAGTAAGAATG 

GTACAGACGAAGAAGTTCAGAGGTGTCAGGCAACGCCATTGGGGTTCTTGGGTCGCTGAG 

ATTCGTCATCCTCTCTTGAAACGGAGGATTTGGCTAGGGACGTTCGAGACCGCAGAGGAG

GCAGCAAGAGCATACGACGAGGCCGCCGTTTTAATGAGCGGCCGCAACGCCAAAACCAAC

TTTCCCCTCAACAACAACAACACCGGAGAAACTTCCGAGGGCAAAACCGATATTTCAGCT 

TCGTCCACAATGTCATCCTCAACATCATCTTCATCGCTCTCTTCCATCCTCAGCGCCAAA 

CTGAGGAAATGCTGCAAGTCTCCTTCCCCATCCCTCACCTGCCTCCGTCTTGACACAGCC 

AGCTCCCATATCGGCGTCTGGCAGAAACGGGCCGGTTCAAAGTCTGACTCCAGCTGGGTC 

ATGACGGTGGAGCTAGGTCCCGCAAGCTCCTCCCAAGAGACTACTAGTAAAGCTTCACAA 

GACGCTATTCTTGCTCCGACCACTGAAGTTGAAATTGGTGGCAGCAGAGAAGAAGTATTG 

GATGAGGAAGAAAAGGTTGCTTTGCAAATGATAGAGGAGCTTCTCAATACAAACTAAATC

TTATTTGCTTATATATATGTACCTATTTTCATTGCTGATTTACAGCCAAAATAATCAATT 

ATACCGTGTATTTTATAGATGTTTTATATTAAAAGGTTGTTAGATATA 

>G975 Amino Acid Sequence (domain in AA coordinates: 4-71) 

MVQTKKFRGVRQRHWGSWVAEIRHPLLKRRIWLGTFETAEEAARAYDEAAVLMSGRNAKT

NFPLNWNNTGETSEGKTDISASSTMSSSTSSSSLSSILSAKLRKCCKSPSPSLTCLRLDT 

ASSHIGVWQKRAGSKSDSSWVMTVELGPASSSQETTSKASQDAILAPTTEVEIGGSREEV

LDEEEKVALQMIEELLNTN* 

>G994 (180..917)

TGTATATATAGTTAGTTAGTTGAGATAAACTTGGTTACCACTTTTGTGTGGTCTTTCTTT 
TTCTTTTTCTCCATTTTCCATTTATCGACCCCTTGGGTGTAGCTAATTACTTTCGCGATT 
TTCAAATCCAATAAAGTTTTAATTTGATGAAGCTTTTTTTAAACCATATAATATAAATAA 
TGGGTGGTCGTAAACCATGTTGTGATGAGGTTGGATTAAGAAAGGGTCCATGGACAGTGG 
AAGAAGATGGGAAACTAGTTGATTTCTTAAGGGCACGTGGCAACTGCGGTGGTGGTGGAG 
GAGGATGGTGCTGGAGAGACGTGCCAAAACTGGCGGGGCTAAGGAGGTGTGGCAAAAGTT 
GCCGTCTCCGGTGGACTAATTATCTCCGGCCAGATCTCAAGAGAGGTCTTTTTACTGAAG 
AAGAAATCCAACTAGTCATTGATCTTCATGCTCGCCTTGGCAATAGATGGTCGAAGATTG 
CAGTGGAGTTACCAGGAAGAACAGACAACGATATCAAAAATTATTGGAACACTCATATAA 
AGAGGAAGCTTATAAGAATGGGTATTGATCCAAACACACATCGTCGATTTGACCAACAAA 
AAGTCAACGAGGAGGAAACGATATTGGTCAACGATCCAAAGCCTCTGTCTGAGACCGAGG 
TATCTGTTGCTTTGAAGAATGACACGTCAGCAGTGTTATCAGGAAATCTAAACCAATTGG 
CTGACGTGGACGGTGATGATCAGCCGTGGAGCTTTCTAATGGAAAATGACGAAGGAGGAG 
GTGGCGACGCCGCCGGAGAGCTTACGATGCTATTGTCCGGTGACATTACGTCATCATGTT 
CTTCTTCGTCATCTTTGTGGATGAAGTATGGAGAATTCGGATACGAAGATTTAGAACTTG 
GATGTTTCGATGTTTAGAGATTCAAGTATGTTTAATTAGGCCGTAGGTTGATTAATCATA 
AGGTTCATTGACTTCATTCTAGAATTGTGTAGTTGGACCAGTATAAAGAATCAAAGTTAT 
GAAACATTGTAATTTGATTTCCAAATTAATCTAATGAATAAATGTGCTTTGCAAAAAAAA 
AAAAAAAAAAAAAAA 

>G994 Amino Acid Sequence (domain in AA coordinates: 14-123) 
MGGRKPCCDEVGLRKGPWTVEEDGKLVDFLRARGNCGGGGGGWCWRDVPKLAGLRRCGKS 
CRLRWTNYLRPDLKRGLFTEEEIQLVIDLHARLGNRWSKIAVELPGRTDNDIKNYWNTHI
KRKLIRMGIDPNTHRRFDQQKVNEEETILVNDPKPLSETEVSVALKNDTSAVLSGNLNQL

ADVDGDDQPWSFLMENDEGGGGDAAGELTMLLSGDITSSCSSSSSLWMKYGEFGYEDLEL 
GCFDV* 

>G2347 (81..626)

AGCCCATCCTTCAACATTGCTTCCTAACCAGAAATCCACCATCATCTTCCCACGAATACA 
ACTTAAAGCTTTACCAGAAAATGGAGGGTCAGAGAACACAACGCCGGGGTTACTTGAAAG 
ACAAGGCTACAGTCTCCAACCTTGTTGAAGAAGAAATGGAGAATGGCATGGATGGAGAAG 
AGGAGGATGGAGGAGACGAAGACAAAAGGAAGAAGGTGATGGAAAGAGTTAGAGGTCCTA 
GCACTGACCGTGTTCCATCGCGACTGTGCCAGGTCGATAGGTGCACTGTTAATTTGACTG
AGGCCAAGCAGTATTACCGCAGACACAGAGTATGTGAAGTACATGCAAAGGCATCTGCTG 
CGACTGTTGCAGGGGTCAGGCAACGCTTTTGTCAACAATGCAGCAGGTTTCATGAGCTAC 
CAGAGTTTGATGAAGCTAAAAGAAGCTGCAGGAGGCGCTTAGCTGGACACAATGAGAGGA 
GGAGGAAGATCTCTGGTGACAGTTTTGGAGAAGGGTCAGGCCGGAGAGGGTTTAGCGGTC 
AACTGATCCAGACTCAAGAAAGAAACAGGGTAGACAGGAAACTTCCTATGACCAACTCAT 
CATTCAAGCGACCACAGATCAGATAAACCCTCCCGCTCTCTCTCTTCTGTCATCTACATA 
TGCTCTATCTACACTCTTATTAGACAAATAATGGCATCTAACAATGTCAAGAAAAGTTGG 
TCATGGTATTAAATCCTACACGGATATATAACTATAAACCTCTAGTCCCCTCTATGCTGT 
CCTGTAATGAATATCTATCCGGAAATGTATTCGCATAGTCTTGCGTCTAATAATGTTTAT 
TGATTTTGTA 

>G2347 Amino Acid Sequence (domain in AA coordinates: 60-136)

MEGQRTQRRGYLKDKATVSNLVEEEMENGMDGEEEDGGDEDKRKKVMERVRGPSTDRVPS 

RLCQVDRCTVNLTEAKQYYRRHRVCEVHAKASAATVAGVRQRFCQQCSRFHELPEFDEAK 

RSCRRRLAGHNERRRKISGDSFGEGSGRRGFSGQLIQTQERNRVDRKLPMTNSSFKRPQI 

R* 

>G2010 (1..525) 

ATGGAGGGTAAGAGATCACAAGGACAAGGTTACATGAAAAAGAAGTCTTACCTTGTGGAA 

GAAGATATGGAGACTGATACGGATGAAGAAGAGGAAGTAGGTAGGGATAGAGTTAGAGGG 

TCTAGAGGTAGCATCAATCGTGGTGGCTCGTTGCGGCTTTGCCAAGTAGATAGATGCACA 

GCTGATATGAAAGAGGCAAAACTGTATCACCGGAGACACAAAGTGTGTGAAGTTCATGCA 

AAGG(^TCTTCTGTCTTTCTCTCAGGACTTAACCAACGCTTTTGTCAACAATGCAGTAGG 

TTTCATGACCTCCAAGAGTTTGATGAAGCTAAGAGAAGTTGCAGGAGGCGCTTAGCTGGA 

CACAATGAGCGAAGAAGGAAGAGCTCTGGTGAGAGTACTTATGGAGAAGGATCAGGTCGG 

AGAGGAATCAATGGTCAGGTGGTGATGCAGAATCAAGAAAGATCAAGGGTAGAGATGACA 

CTTCCTATGCCAAACTCATCATTCAAGCGACCACAGATTAGATAG 

>G2010 Amino Acid Sequence (domain in AA coordinates: 53-127) 

MEGKRSQGQGYMKKKSYLVEEDMETDTDEEEEVGRDRVRGSRGSINRGGSLRLCQVDRCT

ADMKEAKLYHRRHKVCEVHAKASSVFLSGLNQRFCQQCSRFHDLQEFDEAKRSCRRRLAG 

HNERRRKSSGESTYGEGSGRRGINGQVVMQNQERSRVEMTLPMPNSSFKRPQIR*